


NASA Contractor Report 4403 


The Formal Verification 
of Generic Interpreters 


P. Windley and K. Levitt 
University of California 
Davis, California 

G. C. Cohen 

Boeing Military Airplanes 
Seattle, Washington 


Prepared for 

Langley Research Center 
under Contract NAS1-18586 



National Aeronautics and 
Space Administration 

Office of Management 

Scientific and Technical 
Information Program 

1991 




Preface 


This document was generated in support of NASA contract NAS1-18586, Design and Validation of 
Digital Flight Control Systems suitable for Fly-By-Wire applications, Task Assignment 3. Task 3 is 
associated with formal verification of embedded systems. In particular, this document contains results that 
provide a methodological approach to microprocessor verification. A hierarchical decomposition strategy for 
specifying microprocessors is also presented, along with a theory of generic interpreters that can be used to 
model microprocessor behavior. The generic interpreter theory abstracts away the details of instruction 
functionality, leaving a general model of what an interpreter does. 

The NASA technical monitor for this work is Sally C. Johnson of the NASA Langley Research 
Center, Hampton, Virginia. 

The work was accomplished at Boeing Military Airplanes, Seattle, Washington, and the University of 
California, Davis, California. Personnel responsible for the work include: 

Boeing Military Airplanes: 

D. Gangsaas, Responsible Manager 
T.M. Richardson, Program Manager 
G.C. Cohen, Principal Investigator 

University of California: 

Dr. K. Levitt, Chief Researcher 
P. Windley, Ph.D. Candidate 




Contents 


1 Introduction 1 

1.1 Abstraction 2 

1.2 Main Ideas 3 

1.2.1 Hierarchical Decomposition 3 

1.2.2 Generic Interpreters 4 

1.2.3 Composition of Verified Modules 5 

1.3 Contributions 5 

1.4 Formal Proofs 6 

1.5 Notation and Conventions 7 

1.6 Chapter Summaries 7 

2 Previous Work 9 

2.1 Sequential Circuit Verification 9 

2.2 Microprocessor Verification 10 

2.2.1 Tamarack 11 

2.2.2 FM8501 12 

2.2.3 VIPER 12 

2.2.4 SECD 14 

2.2.5 Comparison 14 

2.3 Generic Theories 15 

2.3.1 OBJ 16 

2.3.2 EHDM 17 

2.3.3 HOL 17 

2.3.3.1 The Language 18 

2.3.3.2 The Proof System 20 

2.3.3.3 Generic Theories in HOL 21 

2.4 Using Logic to Specify Hardware 24 

2.4.1 Specifying Circuits with Predicates 25 

2.4.2 Specifying Sequential Behavior 27 

2.4.3 Abstraction and Specification 28 

3 Interpreters 31 

3.1 Hierarchical Decomposition 32 

3.1.1 The Hierarchy 32 

3.1.2 Hierarchical Verification 34 

3.2 Interpreter Hierarchies 36 

3.3 A Mathematical Definition of Interpreters 37 

3.3.1 Basic Types 37 

3.3.2 State 38 

3.3.3 Time 38 

3.3.4 State Streams 39 

3.3.5 Environments 40 

3.3.6 The Interpreter Specification 40 

3.3.7 Interpreter Verification 41 

3.4 Composing Specifications 41 

4 The Formal Models 45 

4.1 Synchronous Interpreters 45 

4.1.1 The Abstract Representation 46 

4.1.2 The Theory Obligations 48 

4.1.3 The Correctness Statement 50 

4.2 Asynchronous Interpreters 52 

4.2.1 Temporal [...] 53 

4.2.2 The Abstract Representation 55 

4.2.3 The Theory Obligations 56 

4.2.4 The Correctness [...] 57 

5 A Verified Microprocessor 61 

5.1 AVM-1's Architecture and [...] 62 

5.1.1 An Architectural View 63 

5.1.1.2 The Registers 63 

5.1.1.3 The Instruction [...] 65 

5.1.1.4 Selecting [...] 76 

5.1.2 An Organizational View 76 

5.1.2.1 The AVM-1 Datapath 78 

5.1.2.2 The Control Unit 79 

5.1.2.3 Timing 84 

5.2 AVM-1's Formal Specification 87 

5.2.1 A Theory of Abstract [...] 88 

5.2.2 Defining the Electronic Block Model 93 

5.2.3 Defining the Phase Level 111 

5.2.4 Defining the [...] 119 

5.2.4.1 The Microcode Assembler 119 

5.2.4.2 The Microinstructions 125 

5.2.5 Defining the Micro-Level 128 

5.2.6 Defining the Macro-Level 132 

5.2.7 Observations 138 

5.3 AVM-1's Formal Verification 141 

5.3.1 Instantiating the Generic Interpreter Theory 141 

5.3.2 Verifying the Phase Level 143 

5.3.3 Verifying the Micro-Level 150 

5.3.4 Verifying the Macro-Level 161 

5.3.5 AVM-1 Is Correct 167 

5.4 Observations 169 

6 Summary 171 

6.1 Summary of Major Results 171 

6.2 Future Work 173 

6.3 Conclusion 174 

References 175 

A Abstract Theories in HOL 179 

A.1 Abstract Theories 179 

A.1.1 Using the Abstract Theory Package 180 

A.1.2 Abstract Representations 180 

A.1.3 Theory Obligations 181 

A.1.4 Instantiating Theories 181 

A.2 Implementational Considerations 182 

A.3 Limitations 184 

A.4 Implementing Abstract Theories in HOL 186 

B The Organization of the Proof 191 

B.1 Proof Organization 191 

B.2 Proof Metrics 



List of Figures 


1.1 A microprocessor specification can be decomposed hierarchically 4 

2.1 Implementation of a simple circuit, D 26 

3.1 An interpreter has a flat control structure 32 

3.2 A microprocessor specification can be decomposed as a series of interpreters 33 

3.3 A hierarchy of interpreters 36 

3.4 Interconnecting interpreters in the hierarchy 37 

3.5 A temporal abstraction function maps time at one level to time at another level 39 

4.1 The function T, which maps time at one level to another, can be defined in terms of a predicate, Q, which is true only when the mapping occurs 53 

5.1 The instruction formats in AVM-1 66 

5.2 The AVM-1 Datapath 77 

5.3 The AVM-1 Control Unit 80 

5.4 The clock signals in AVM-1 84 

5.5 A PERT phase diagram for AVM-1 84 

5.6 The ALU Specification for AVM-1 100 

5.7 The specification for the datapath (continued on next page) 103 

5.8 The specification for the datapath (continued) 104 

5.9 Phase four of the phase-level interpreter (continued on next page) 114 

5.10 Phase four of the phase-level interpreter (continued) 115 

5.11 The generic interpreter theory can be instantiated with definitions of the various levels from the hierarchical decomposition to yield a proof of the microprocessor 142 

5.12 Instantiating the abstract theory for the phase-level 149 

5.13 Instantiating the abstract theory for the micro-level 159 

5.14 Instantiating the abstract theory for the macro-level 166 

B.1 The theory hierarchy for the proof of AVM-1 192 

List of Tables 


2.1 Comparison of verified microprocessors 15 

2.2 HOL Infix Operators 19 

2.3 HOL Binders 19 

2.4 HOL Type Operators 20 

3.1 Basic types for interpreter definition 38 

5.1 Symbols in the Register Transfer Language 64 

5.2 The program status word 65 

5.3 The AVM-1 instruction set 67 

5.4 Jump codes for the JMP instruction 68 

5.5 Synthesizing addressing modes using AVM-1's load and store instructions 75 

5.6 Synthesizing instructions using AVM-1's instruction set 75 

5.7 Opcode breakdowns for AVM-1's instruction set 76 

5.8 Implementation of the jump codes for the JMP instruction. cf is the carry flag in the PSW, zf is the zero flag, etc. 78 

5.9 The microinstruction format for AVM-1 82 

5.10 Comparison of verified microprocessors and AVM-1 85 

5.11 The functions used to instantiate the abstract representation of the generic interpreter theory for the phase-level 118 

5.12 The microinstruction format for AVM-1 120 

5.13 Register mnemonics for the microassembler 121 

5.14 Shifter mnemonics for the microassembler 121 

5.15 ALU mnemonics for the microassembler 122 

5.16 Program status word mnemonics for the microassembler 122 

5.17 External signal mnemonics for the microassembler 123 

5.18 Microprogram counter mnemonics for the microassembler 124 

5.19 The functions used to instantiate the abstract representation of the generic interpreter theory for the micro-level 132 

5.20 The functions used to instantiate the abstract representation of the generic interpreter theory for the macro-level 138 

5.21 The functions used to instantiate the abstract representation of the generic interpreter theory for the phase-level 143 

5.22 The functions used to instantiate the abstract representation of the generic interpreter theory for the micro-level 151 

5.23 The functions used to instantiate the abstract representation of the generic interpreter theory for the macro-level 162 

B.1 Script run-times on a SPARCStation with 16M of memory 195 



Chapter 1 


Introduction 


If we do not succeed in solving a mathematical problem, 
the reason frequently consists in our failure to recognize 
the more general standpoint from which the problem before us 
appears only as a single link in a chain of related problems. 
After finding this standpoint, not only is this problem frequently 
more accessible to our investigation, but at the same time, 

we come into possession of a method 
which is applicable to related problems. 

— David Hilbert — 


Computers are being used with increasing frequency in areas where the correct 
implementation of the computer hardware is critical. These include: 

• Safety-critical applications where the computer is directly involved in the 
control of systems that maintain human life. A flight control system on an 
aircraft or the control system in a nuclear power plant are examples of this 
type of application. 

• Security-critical applications where the computer is used to process informa- 
tion that is economically or politically sensitive. Almost any computer used 
in government or industry falls into this category to one degree or another. 

• Mass produced consumer goods where the computer is an integral part of the 
product and a mistake in the design or implementation could result in product 
recalls costing enormous amounts of money. 

In these and other applications it is vital that the computer system be correct. 

There are two complementary approaches to computer correctness: fault tolerance 
and fault exclusion. The former is most useful in handling dynamic faults occurring 
during system operation due to component failure or other unexpected events. The 
latter is a static process intended to remove errors in design and implementation 
before the computer system is in service. 

Testing is an example of a fault exclusion technique. Testing can be divided into 
two distinct kinds: implementational testing, which is used to verify that a physical 
device is implemented correctly, and functional testing, which is used to verify that 
a design functions as the designer intended. Because it is impossible to exhaus- 
tively test a computer system, formal verification is an attractive alternative to 
functional testing. 

Formal verification requires at least two descriptions of a system: one of its im- 
plementation and one of its specification. Correctness is shown by demonstrating 
through mathematical proof that the former implies the latter. Since verification 
entails reasoning about formal logic, producing specifications with the formality 
needed for verification is difficult. Several steps must be taken to make verification 
available and acceptable to industry: 

• Methodologies that provide a step-by-step approach to system verification 
must be produced. 

• Exemplary verified systems must be provided. 

Our goal is to make microprocessor verification tractable. To that end, this 
dissertation contains a methodology for microprocessor verification that results in 
a step-by-step approach. We also give an example showing how the methodology 
can be applied to the verification of a realistic microprocessor. 


1.1 Abstraction. 


Abstraction is the suppression of irrelevant detail or information. Uses of abstrac- 
tion abound in everyday life. For example, a map is an abstraction of the area it 
represents. The irrelevant details of the area being mapped are suppressed so that 
the map’s users are not confused or hampered by unnecessary facts. 

Abstraction is a key concept in both mathematics and computer science. Using 
abstraction, we make complex models more tractable, avoid repeating work, and 
develop methods for solving general problems. For example, procedural and data 
abstraction are used frequently in software engineering to ease the burden of pro- 
gramming by suppressing detail, providing reusable structures, and giving general 
algorithms for computational problems. 

Naturally, abstraction has a place in formal verification as well. Melham [Mel88] 
provides an important discussion of structural, behavioral, data, and temporal ab- 
straction in the verification of computer hardware. Structural abstraction sup- 
presses detail about the internal structure of a device. A behavioral abstraction is 
a partial specification of a device; the specification may leave out timing details or 
other functionality not considered important. Data abstraction suppresses imple- 
mentation details of a data type so that only its functionality is visible. Temporal 
abstraction relates different views of time in the specification of a device. 

Melham discusses the use of abstraction in hardware verification in general; we 
will concentrate on the application of abstraction to modeling and verifying micro- 
processors. We ask the questions: 

1. Are there particular forms of behavioral and structural abstraction that are 
more efficacious in the verification of microprocessors than others? 

2. Can we formalize a general model that incorporates the behavioral, data, and 
temporal abstractions used in microprocessor verification so that they can be 
easily reused? 

As we will see in the chapters that follow, we believe that the answer to both of 
these questions is yes, and we will describe a hierarchical decomposition strategy 
and a generic interpreter model that make the verification of large microprocessors 
practical. 


1.2 Main Ideas. 


This section introduces the main ideas in this paper. These concepts will be 
discussed in detail in later chapters. 


1.2.1 Hierarchical Decomposition. 

As we mentioned, verification requires at least two formal descriptions of the com- 
puter system: one behavioral, B, and one structural, S. Verification consists of 
showing through formal proof techniques that 

$$S \Rightarrow B$$

One need not be limited, of course, to one level of abstraction. Supposing that $B_1$ 
through $B_n$ represent increasingly abstract specifications of the system’s behavior, 
one could verify its correctness by proving 

$$S \Rightarrow B_1 \Rightarrow \cdots \Rightarrow B_n$$

Figure 1.1 shows how this principle can be applied to the specification of a micro- 
programmed microprocessor. At the bottom of the hierarchy is the usual structural 
specification of the electronic block model. This specification describes the com- 
puter’s implementation; that is, the connections among its various components. At 
the top is the behavioral specification corresponding to the programmer’s model of 
the microprocessor. In between these are two additional abstraction levels: one for 
the microcode interpreter and one specifying the phase (or subcycle) behavior. 

Figure 1.1: A microprocessor specification can be decomposed hierarchically. 

Hierarchical decomposition plays an important role in the methodology for ver- 
ifying microprocessors presented in this dissertation. The use of a hierarchical 
decomposition can lead to significant reductions in the amount of effort used to 
structure and complete a correctness proof. 


1.2.2 Generic Interpreters. 

With one exception, each of the levels in the specification hierarchy shown in Fig- 
ure 1.1 has the same structure. The bottom-level specification is a structural de- 
scription, but the other specifications all share a common structure. Each of the 
abstract behavioral descriptions can be specified using an interpreter model. 

Perhaps the most distinguishing feature of an interpreter is that it has a flat 
control structure. One of n instructions is chosen based on the current state. The 
chosen instruction operates on the state and the cycle begins anew. There are 
a large number of interesting computer systems that have a flat control structure: 
microprocessors, operating systems, language interpreters, and editors are a few. 
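
The flat control structure described above can be sketched in a few lines of Python. This is only an illustration: the two-field state, the instruction names INC and DBL, and the parity-based selection rule are all hypothetical and are not taken from the report.

```python
from typing import Callable, Dict, Tuple

# Hypothetical state for illustration: (program counter, accumulator).
State = Tuple[int, int]

def inc(state: State) -> State:
    """Hypothetical instruction: add one to the accumulator."""
    pc, acc = state
    return (pc + 1, acc + 1)

def dbl(state: State) -> State:
    """Hypothetical instruction: double the accumulator."""
    pc, acc = state
    return (pc + 1, acc * 2)

# The n instructions of the interpreter (here n = 2).
INSTRUCTIONS: Dict[str, Callable[[State], State]] = {"INC": inc, "DBL": dbl}

def select(state: State) -> str:
    """Choose one instruction based only on the current state."""
    pc, _ = state
    return "INC" if pc % 2 == 0 else "DBL"

def step(state: State) -> State:
    """One cycle: select an instruction, then apply it to the state."""
    return INSTRUCTIONS[select(state)](state)

def run(state: State, cycles: int) -> State:
    """Repeat the select-and-apply cycle; there is no nested control."""
    for _ in range(cycles):
        state = step(state)
    return state
```

The only control flow is the single loop in `run`; everything else is a choice among instructions followed by one state update, which is what makes the structure flat.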

Since the behavioral descriptions in the specification hierarchy are all similar, 
we would prefer to develop a general model of an interpreter and use this model in 
our specification rather than treating each level in the hierarchy separately. We can 
ask several interesting questions about the interpreter model: 

• How can one interpreter be used to implement another interpreter? 

• Can we formalize the data and temporal abstractions between these levels? 

• What, if anything, can we say about the correctness of an interpreter’s imple- 
mentation? 

• Can we formalize the model in a verification environment so that it can be 
easily reused in verifying microprocessors? 

The chapters that follow provide the answers to these questions. 


1.2.3 Composition of Verified Modules. 

Verified systems can be constructed using verified components. The composition of 
verified components that share state is a topic that has not received much attention. 
Indeed, most of the microprocessor verifications done in the past have assumed that 
the CPU was the sole user of memory — even when the CPU’s designer claimed that 
input/output was memory mapped. 

In this dissertation we take a first step toward the specification of computer 
components that share state with other devices. We define the concept of shared 
state and propose mechanisms for specifying the reading of and writing to shared 
state. The assumptions on the final proof of correctness clearly state how conflicts 
regarding shared state are resolved. 


1.3 Contributions. 

The work described in this dissertation makes the following contributions: 

1. The hierarchical decomposition strategy provides a firewall for the structural 
complexity of the electronic block model specification well below the large case 
explosion that occurs at the top-level. This firewall results in a substantial 
savings in effort over past specification methods since the large number of 
cases in the upper levels can now be handled in a regular, largely automatic 
manner. 

2. Generic proofs formalize a methodology for verifying microprocessors. The 
generic interpreter proof clearly states what definitions need to be made and 
what lemmas need to be proven about these definitions in order to verify a micro- 
processor. This is in sharp contrast to previous microprocessor verifications 
where the specification and verification proceeded on an ad hoc basis. 

3. Our technique for specifying components with shared state decreases the se- 
mantic gap between what the designer intends and what the specification 
says. Our CPU specification recognizes that other components may change 
the contents of memory and other shared registers. The proofs that result 
from these specifications have very satisfying interpretations with respect to 
the assumptions that indicate how conflicts over shared state are resolved. 

In addition to the major benefits listed above, there are a number of other benefits 
that result from our work: 

• The generic theory can be instantiated, resulting in the reuse of large pieces 
of the generic proof. 

• Temporal and data abstraction are handled completely within the generic 
theory freeing the user from proving theorems about these abstractions. 

• The generic proofs show exactly what has been proven. There is no superfluous 
detail cluttering up the definitions and theorems. 

• Our interpreter model recognizes the environment and treats it separately 
from the state. 


1.4 Formal Proofs. 


The paper deals with the formal verification of generic interpreters. What exactly 
is implied by the word formal ? 

The word formal is used to describe many things in mathematics: formal systems, 
formal logics, formal proofs, and so on. A formal object is one where rigor is 
maintained through a methodical treatment. 

In a formal system, great emphasis is placed on syntax (i.e., the form). In a formal 
logic, for example, the syntax of the logic is set forth unambiguously and inference 
rules for manipulating the syntax are clearly defined. A formal proof in this logic 
takes place syntactically through the application of inference rules in a sequential 
manner. The use of inference rules to transform terms syntactically helps keep the 
prover’s semantic biases from creeping into the proof. 

The behavioral and structural models for computer systems can be very large. 
Proofs of correctness for microprocessors have, in some cases, been done using 
paper and pencil. Usually, however, the proofs are so large that some form of 
mechanical proof support is necessary to maintain the required rigor. In one case, 
the formal verification of a microprocessor found errors in a completed informal 
proof [Coh88b]. Most proofs involving mechanical theorem provers are done using 
some sort of formal logic since formalization is a prerequisite to mechanization. 

In a strictly proof theoretic sense, the proofs in this paper are not formal. A formal 
proof is like a number — an abstract object that can never be represented in the 
physical world. Numerals are not numbers, but merely represent them. While in a 
strict mathematical sense formal proofs are impossible to express, the term formal 
verification is used ubiquitously in the verification community to mean a proof in a 
formal logic, usually with the help of some sort of mechanized theorem prover. We 
use the term formal in this sense. 


1.5 Notation and Conventions. 

Our notation will be that of standard logic with a few extensions: 

• Terms in the logic will be written in typewriter font. 

• Conjunction, disjunction, negation, implication, universal quantification, ex- 
istential quantification, and lambda abstraction use the usual symbols: $\wedge$, $\vee$, 
$\neg$, $\Rightarrow$, $\forall$, $\exists$, and $\lambda$, respectively. 

• We use a conditional operator that is written $a \rightarrow b \mid c$, meaning “if a, 
then b, else c.” 

• Definitions will be denoted with a prepended $\vdash_{def}$. 

• Terms that have been formally proven in the logic will be prepended with $\vdash$. 

Other notations and logical expressions will be explained as they are used. 


1.6 Chapter Summaries. 


This document begins, in Chapter 2, with a discussion of related work. The idea 
of using abstract representations of theories is not new to our research. There are 
several specification and theorem proving systems that support generic modules and 
several examples of their use in the literature. In addition, a number of microproces- 
sors have been specified and verified in formal systems. Most of these verifications 
used an implicit interpreter model for the behavioral specification. 

Our research is, we believe, the first where the interpreter model has been formal- 
ized. Chapter 3 discusses our model of interpreters, gives a mathematical definition 
of an interpreter, and defines what it means to verify one interpreter in terms of 
another. The chapter also discusses how hierarchical decomposition can be used 
in the specification of microprocessors and discusses the composition of verified 
components that share state. 

This theme is extended in Chapter 4 where we show how the mathematical defi- 
nition can be formalized in the HOL verification system. We present two different 
models: a synchronous interpreter model and an asynchronous model. 

Chapter 5 contains an example of the use of the generic interpreter theory in 
specifying and verifying a microprocessor, AVM-1. AVM-1 is designed to serve as 
a testbed for the concepts in this dissertation. The architecture and organization 
of AVM-1 are described, the formal specification is presented, and the verification 
is discussed. 

Appendix A provides a description of the ML package developed for using generic 
theories in HOL. 

Appendix B presents the technical details of the AVM-1 proof. The theory hier- 
archies are discussed and the run times for the proof scripts of the various theories 
constituting the verification of AVM-1 are presented. 


Chapter 2 


Previous Work 


This chapter is divided into four sections. The first section discusses research in the 
verification of sequential machines and its relation to our work. The second section 
discusses previous microprocessor verifications where a model similar to the one 
formalized in this dissertation was employed. The third section describes related 
work in generic theories. The last section describes how hardware 
behavior and structure are specified in a formal logic. 


2.1 Sequential Circuit Verification. 


Reasoning about and verifying sequential circuits is a topic that has generated much 
research interest due to the inherent difficulty involved in reasoning about state. The 
standard engineering formalism taught in undergraduate switching theory classes 
is useful for reasoning about small state machines; but when the number of states 
increases the model suffers from exponential case explosion. In this section we will 
discuss some of the approaches to the problem. 

Sequential machines have been studied for decades. The work in this disserta- 
tion is similar in spirit to Gordon’s work [Gor80] in the denotational semantics of 
sequential machines and Plotkin’s state transition systems [Plo81]. Our goal is, 
however, not to simply describe sequential machines, but to verify them. 

Several researchers have developed special purpose languages for describing and 
reasoning about sequential machines. Browne and Clarke [BC87] have developed a 
high-level language, called SML , for describing finite state machines. SML is based 
on a temporal logic semantics. The state transition table generated from an SML 
program can be fed to a temporal logic verifier that allows some properties of the 
state machine to be verified. 

Bronstein and Talcott [BT89] have developed a string-functional semantics of 
synchronous sequential logic and have applied it to the verification of pipelines 
and systolic arrays. The theory is based on finite rather than infinite arrays and 
thus cannot reason about asynchronous circuits. Because of the finite property of 
the underlying semantics, the theory can be developed and used in a first-order 
system such as the Boyer-Moore theorem prover [BM79]. 

Loewenstein has developed a theory of state machines in HOL [Loe89]. The 
distinguishing feature of Loewenstein’s work is that the theory is developed in the 
same theorem proving environment that he uses to reason about his state machines. 
The theory contains theorems that define both deterministic and non-deterministic 
state machines and derives lemmas that state what it means for one state machine 
to implement another and what it means for two state machines to be equivalent. 
Loewenstein’s theory is similar to the model that we will define in Chapter 3. The 
primary difference is that Loewenstein’s model does not formalize temporal and 
data abstraction between a state machine and its implementation. 

SDVS (State Delta Verification System) was originally developed and described 
by Crocker in [Cro77]. A state-delta is a temporal logic formula that describes the 
changes in the global state of a machine over a time interval. SDVS is currently 
under development at the Computer Science Laboratory of the Aerospace Corpo- 
ration [MCL84,Mar87]. The original goal of SDVS was to provide a usable system 
for proving the correctness of microcode expressed in the ISPS register transfer 
language. SDVS was also used to reverify Hunt’s FM8501 (see Section 2.2.2). 

The next section discusses the work in state machine verification that most closely 
resembles our own work. 


2.2 Microprocessor Verification. 


There have been numerous efforts to verify microprocessors. Many of these have 
used the same implicit behavioral model. We will first describe this implicit model 
and then describe the microprocessor verifications that use it. 

In general, the model uses a state transition system to describe the microproces- 
sor. The microprocessor specification has four important parts: 

1. A representation of the state, S. This representation varies depending on the 
verification system being used. 

2. A set of state transition functions, J, denoting the behavior of the individual 
instructions of the microprocessor. Each of these functions takes the state 
defined in step (1) as an argument and returns the state updated in some 
meaningful way. 

3. A selection function, N, that selects a function from the set J according to 
the current state. 

4. A predicate, I, relating the state at time t + 1 to the state at time t by means 
of J and N. 

In some cases, the individual state transition functions, J, and the selection func- 
tion, N, are combined to form one large state transition function. Also, a functional 
specification would use a function for part (4) instead of a predicate. The specifi- 
cations, however, are largely the same. 
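
The four parts of this implicit model can be sketched as follows. The sketch is illustrative only: the three opcodes, the field names, and the bounded `horizon` used to check the predicate over a finite prefix of the state stream are assumptions made for the example, not details of any of the verified microprocessors discussed here.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple

class State(NamedTuple):
    """Part 1: a representation of the state (hypothetical fields)."""
    pc: int
    acc: int
    mem: Tuple[str, ...]  # program memory holding opcode names

# Part 2: state transition functions, one per instruction.
def op_add(s: State) -> State:
    return s._replace(pc=s.pc + 1, acc=s.acc + 1)

def op_clr(s: State) -> State:
    return s._replace(pc=s.pc + 1, acc=0)

def op_jmp0(s: State) -> State:
    return s._replace(pc=0)

J: Dict[str, Callable[[State], State]] = {
    "ADD": op_add, "CLR": op_clr, "JMP0": op_jmp0,
}

def N(s: State) -> Callable[[State], State]:
    """Part 3: select a function from J according to the current state."""
    return J[s.mem[s.pc % len(s.mem)]]

def I(stream: Callable[[int], State], horizon: int) -> bool:
    """Part 4: relate the state at t + 1 to the state at t via J and N.
    Checked only for t < horizon, since a program cannot test all times."""
    return all(stream(t + 1) == N(stream(t))(stream(t)) for t in range(horizon))

def trace(s0: State, horizon: int) -> List[State]:
    """Functional variant of part 4: compute the states instead of relating them."""
    states = [s0]
    for _ in range(horizon):
        states.append(N(states[-1])(states[-1]))
    return states
```

A stream built with `trace` satisfies `I` by construction, which mirrors the remark that the predicate and functional styles of specification are largely the same.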

After the microprocessor has been specified, we can verify that a machine de- 
scription, M, implements it by showing 

$$\forall s \in S.\ M(s) \Rightarrow I(s).$$

That is, I has the same effect on the state, s, that M does. This theorem is typically 
shown by case analysis on the instructions in J by establishing the following lemma: 

$$\forall j \in J.\ M(s) \Rightarrow (\forall t : \mathit{time}.\ C(j, s, t) \Rightarrow s(t + n_j) = j(s(t)))$$

where C is a predicate expressing the conditions for instruction j's selection, s(t) is 
the state at time t, and $n_j$ is the number of cycles that it takes to execute j. This 
lemma says that if an instruction j is selected, then applying j to the current state 
yields the state that results by letting the implementing interpreter M run for $n_j$ 
cycles. We call this lemma the instruction correctness lemma. 
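
As a rough illustration of the instruction correctness lemma, the sketch below checks it by brute force on a finite trace of a toy multi-cycle machine. Everything about the machine is hypothetical (the program, the INC and DBL instructions, the cycle counts, and the `visible` projection that hides the internal `tmp` and `phase` fields), and testing a finite trace is of course much weaker than the proof the lemma calls for.

```python
from typing import Callable, Dict, List, Tuple

Micro = Tuple[int, int, int, int]   # (pc, acc, tmp, phase): implementation state
Macro = Tuple[int, int]             # (pc, acc): programmer-visible state

PROGRAM = ("INC", "DBL", "INC")     # hypothetical fixed program

def micro_step(s: Micro) -> Micro:
    """One clock cycle of the hypothetical implementing machine M."""
    pc, acc, tmp, phase = s
    op = PROGRAM[pc % len(PROGRAM)]
    if phase == 0:                       # fetch cycle
        return (pc, acc, tmp, 1)
    if op == "INC":                      # INC finishes in its second cycle
        return (pc + 1, acc + 1, tmp, 0)
    if phase == 1:                       # DBL: latch the accumulator
        return (pc, acc, acc, 2)
    return (pc + 1, acc + tmp, tmp, 0)   # DBL: add the latched value

# The instruction-level functions j and their cycle counts n_j.
J: Dict[str, Callable[[Macro], Macro]] = {
    "INC": lambda v: (v[0] + 1, v[1] + 1),
    "DBL": lambda v: (v[0] + 1, v[1] * 2),
}
CYCLES = {"INC": 2, "DBL": 3}

def visible(s: Micro) -> Macro:
    """Project the implementation state onto the programmer-visible state."""
    return (s[0], s[1])

def selected(j: str, s: Micro) -> bool:
    """C(j, s, t): instruction j is about to start in state s."""
    return s[3] == 0 and PROGRAM[s[0] % len(PROGRAM)] == j

def run(s0: Micro, steps: int) -> List[Micro]:
    states = [s0]
    for _ in range(steps):
        states.append(micro_step(states[-1]))
    return states

def instruction_correct(j: str, states: List[Micro]) -> bool:
    """Check s(t + n_j) = j(s(t)) wherever j is selected in the trace."""
    n = CYCLES[j]
    return all(
        visible(states[t + n]) == J[j](visible(states[t]))
        for t in range(len(states) - n)
        if selected(j, states[t])
    )
```

Calling `instruction_correct(j, run((0, 1, 0, 0), 20))` for each `j` in `J` performs the case analysis on instructions over one sample run.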

The remaining parts of this section describe microprocessor verifications where 
some variation of this general model was used. 


2.2.1 Tamarack 

Tamarack is a small microcoded microprocessor that has been verified by Jeffrey 
Joyce at the University of Cambridge [Joy89a,Joy88]. Joyce has verified Tamarack 
to the transistor level using HOL and has fabricated an 8-bit version of the design 
in CMOS. In addition to verifying the microprocessor, Joyce has also verified a 
compiler for Tamarack [Joy89b]. 

Tamarack is a 16-bit computer with a 13-bit address space. The computer has 
8 instructions: halt, jump, jump if zero, add, subtract, load, store, and skip (or no 
operation). The architecture has an accumulator and a program counter visible to 
the assembly language programmer in addition to the memory. The computer is 
implemented in microcode and has a single bus connecting each of the blocks in the 
electronic block model. The microstore is 32 microwords long. 

Tamarack is based on a computer designed and verified using LCF-LSM by Mike 
Gordon [Gor83]. Daniel Weise verified Gordon’s design using a Lisp-based system 
called Silica Pithecus [Wei86] and Harry Barrow verified it using a system called 
VERIFY [Bar84], making this the most widely verified microcomputer design. 

The specification and verification of Tamarack corresponds closely to the general 
model developed at the beginning of this section. The macro-level specification 
denotes what each instruction does and ties the descriptions of each instruction 
together with a predicate stating the relation between the state at time t and time
t + 1.

The verification of Tamarack is enlightening since it has been done many times 
with many different verification systems and using many levels of abstraction. 
Tamarack is, however, small and research is needed to discover methods for scal- 
ing the Tamarack experience to larger microprocessors, including those with larger 
instruction sets and support for operating systems. 


2.2.2 FM8501. 

FM8501 is a microprocessor designed and verified by Warren Hunt using the
Boyer-Moore theorem prover [Hun87]. The architecture has a register file containing
eight 16-bit registers, a 64K-byte memory space, 26 instructions, and four memory
addressing modes. FM8501 models memory as an asynchronous process. The
implementation is microcoded and has a microstore of 16 microwords.

The specification of FM8501 consists of two recursive functions: one for the be- 
havioral specification and one for the implementation. The functions recurse at 
each clock cycle, computing a new state. Time and the asynchronous inputs to 
the CPU are modeled by an oracle. The oracle is represented by a list; it is this
list that the specifications recurse on. Time is represented by the current position
of the recursive specification in the list. Each member of the list gives whatever
asynchronous inputs may exist at that time. The proof shows the equivalence of
the two recursive functions using an abstract (uninterpreted) oracle function.
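
The style of specification described above, two functions that recurse on an oracle list, can be sketched in Python. This is only an illustration under assumed names: the 3-bit counter, its asynchronous reset input, and the functions spec, impl, and equivalent are hypothetical and far simpler than FM8501. The check also enumerates oracles up to a bounded length, whereas the actual proof holds for an uninterpreted oracle.

```python
from itertools import product

def spec(count, oracle):
    """Behavioral level: a counter modulo 8 with an asynchronous reset."""
    if not oracle:                      # oracle exhausted: time has run out
        return count
    reset = oracle[0]                   # asynchronous input for this cycle
    nxt = 0 if reset else (count + 1) % 8
    return spec(nxt, oracle[1:])

def impl(bits, oracle):
    """Implementation level: three flip-flops (LSB first) and carry logic."""
    if not oracle:
        return bits
    if oracle[0]:
        nxt = (0, 0, 0)
    else:
        b0, b1, b2 = bits
        carry1 = b0                     # carry into bit 1
        carry2 = b0 & b1                # carry into bit 2
        nxt = (b0 ^ 1, b1 ^ carry1, b2 ^ carry2)
    return impl(nxt, oracle[1:])

def to_count(bits):
    """Abstraction from the bit-level state to the behavioral state."""
    b0, b1, b2 = bits
    return b0 + 2 * b1 + 4 * b2

def equivalent(max_len):
    """Compare both functions on every oracle of length <= max_len."""
    for n in range(max_len + 1):
        for oracle in product([False, True], repeat=n):
            if to_count(impl((0, 0, 0), oracle)) != spec(0, oracle):
                return False
    return True
```

The position reached in the oracle plays the role of time: each recursive call consumes one element, which corresponds to one clock cycle.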

Crocker et al. reverified FM8501 using a specification written in ISPS in the 
SDVS verification system [CCLO88]. The reverification is significant because the
work used no part of Hunt’s work directly and thus represents an independent 
verification of the design using a different verification system. 

On the surface, the verification of FM8501 appears quite different from the
verification of Tamarack, but in fact, they are very similar. The methods of specification
for the top level can be seen as an instance of the general model presented at the
beginning of this section. The verification, even though done on a functional
specification in a first-order system, uses a form of the instruction correctness lemma
to show that the electronic block model implements the top-level specification.


2.2.3 VIPER. 

VIPER was designed by Britain’s Royal Signals and Radar Establishment (RSRE) 
at Malvern to provide a formally verified microprocessor for use in safety critical 
applications. VIPER’s designers chose not to include a stack and interrupts,
anticipating that they might lead to difficulties in the verification of software running
on the VIPER. The machine was designed to halt on errors and raise an external
exception. The fabrication was carried out by two separate manufacturers and is 
commercially available. 

VIPER is the first microprocessor available for commercial use where formal
verification was used. As we will see, the verification was not completed. While VIPER
is significantly simpler than today’s general purpose microprocessors, its verification
provides a benchmark for the state of the art in microprocessor verification.

VIPER has a 20-bit program counter, a 32-bit general purpose accumulator, and 
two 32-bit index registers. VIPER has a single instruction format that allows the 
user to select a source register, one of four memory addressing modes, one of eight 
destinations, whether or not to compare, and one of sixteen ALU functions. In ad- 
dition to the fields just mentioned, each instruction contains a 20-bit address. The 
VIPER design is described in detail in [Cul88]. The implementation is hardwired 
instead of being microcoded. 

The combination of fields in the instruction format (excluding source and des- 
tination selections) yields 128 different instruction cases. Recent research on the 
VIPER design [Aro90] has characterized the VIPER instruction set using only 20 
instructions. As we will see, this is an important distinction that bears on the
difficulty of verifying VIPER.
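
One way to account for the figure of 128 is to multiply the remaining field choices listed above. This assumes that the count is simply the product of the four addressing modes, the compare flag, and the sixteen ALU functions; the report does not spell out the calculation.

```latex
\[
\underbrace{4}_{\text{addressing modes}} \times
\underbrace{2}_{\text{compare or not}} \times
\underbrace{16}_{\text{ALU functions}} = 128
\]
```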

The specification of VIPER is hierarchical. The top-level specification of VIPER
is similar to that of [Joy89a]. The next level of the specification is called the
major-state machine and is a description of VIPER’s major states. The next level in
the specification is the electronic block model. The top two levels were specified 
first in LCF-LSM and later in HOL. The electronic block model was specified in 
HOL. Below the electronic block model the circuit was described using a hardware 
description language called ELLA and verified by “intelligent exhaustive simulation” 
[Pyg85]. 

A paper-and-pencil proof of correctness between the top level of VIPER and
the major-state machine was done by RSRE. Because of the complexity of the
lower-level (electronic block model to major-state machine) proof, RSRE did not
attempt a hand proof of this level. RSRE contracted with Avra Cohn at Cambridge
University to formalize the top-level proof and perform the lower-level proof. Cohn
describes her formal verification of the major-state machine with respect to the
top-level specification in [Coh88b].

Cohn decided to forgo the proof of the top-level correspondence in trying to
verify the electronic block model since the major-state level specification and the
electronic block model yielded dissimilar structures under case analysis. Rather,
she attempted to show a direct correspondence between the top level and the
electronic block model. The proof is described in detail in [Coh88a]. Cohn’s proof of
this level remains incomplete because of the large case explosion that occurred and
the size of the proofs in each of the cases. This is not to say that the proof could
not be completed, only that completing it would come at great expense.

It seems clear from Cohn’s experience with VIPER that abstraction is critical
in dealing with the large case explosion that occurs in these kinds of proofs. The 
major-state machine did provide a level of abstraction in-between the top-level 
and the electronic block model, but it appears to be the wrong one. In addition, 
Cohn had almost no access to VIPER’s designers and thus had little or no help 
in deciphering and understanding the informal specification of the electronic block 
model. 


2.2.4 SECD. 


Brian Graham et al. at the University of Calgary have undertaken the
implementation and verification of the SECD machine [GB89]. The SECD machine is an
abstract Lisp machine invented by Landin to reduce lambda expressions [Lan64]. 
The variant of SECD implemented by Graham is described in [Hen80]. Graham’s 
work is part of a larger effort at the University of Calgary to verify a complete 
system including a LispKit compiler as well as the SECD chip. 

The architecture has four registers, called S, E, C, and D. The S register holds 
a stack pointer, the E register holds a pointer to the environment, the C register 
functions as a program counter, and D points to a stack used to dump the state of 
the machine. There are approximately 20 instructions and the implementation is 
microcoded. 

The remarkable thing about the SECD proof is that even though the architecture 
is specialized, the specifications and proofs are done in a manner very similar to the 
proofs of the more conventional architectures described in the last three sections. 
The behavioral model corresponds to the general model described at the beginning 
of this section. The top-level specification is based on state-transitions and the 
description of the electronic block model is a predicate-based circuit description 
similar to both [Joy89a] and [Coh88a]. The garbage collection mechanism is imple- 
mented in hardware and the proof was done without taking it into account. Work 
is in progress on a second proof that verifies the garbage collection hardware and a 
second implementation. 


2.2.5 Comparison. 

Table 2.1 summarizes the designs of the four microprocessors presented in this 
section. The table, like all such tabulations, cannot hope to capture all of the im- 
portant characteristics of the microprocessors, but the data presented does provide 
some basis for judging relative complexities. 

Table 2.1: Comparison of verified microprocessors.

                    Tamarack    FM8501      VIPER       SECD
User Registers      2           8           4           4
Instructions        8           26          20          21
Microcoded          yes         yes         no          yes
Microstore size     32 words    16 words    N/A         512 words
Interrupts          yes         no          no          no
Memory Model        async       async       sync        sync
Word Width          16-bit      16-bit      32-bit      32-bit
Memory Size         8K          64K         1M          16K


2.3 Generic Theories. 


Generic theories provide structures to support theorem reuse. Generic theories are 
similar in spirit to generic modules in programming languages such as Ada [Ada83]. 
Even accounting for the obvious differences between Ada as a programming language 
and our use of generics in a verification environment, however, we shall see that the 
notion of a generic theory is stronger than that of a generic module in Ada. 

A generic theory consists of three parts:

1. An abstract representation of the uninterpreted constants and types in the 
theory. 

2. A list of theory obligations defining relationships between members of the 
abstract representation. 

3. A collection of abstract theorems about the representation. 

The abstract representation contains a set of abstract operations and a set of 
abstract objects. An abstract object does not necessarily need to be specifically 
declared, but can be declared through use. The semantics of the abstract represen- 
tation are unspecified; that is, we don’t know (inside the theory) what the objects 
and operations mean. 

The theory obligations are a set of predicates. Inside the theory, the obligations 
represent axiomatic knowledge about the abstract representation. Using the obli- 
gations as axioms allows us to prove theorems of interest about the abstract objects 
and operations. Outside the theory, the obligations represent the criteria that a 
concrete representation must meet if it is to be used to instantiate the abstract 
theory. 

The theory obligations represent the largest difference between generic theories 
and the generics of Ada. If we view the generic portion of our theory as the interface, 
the abstract representation can be thought of as the syntax of the interface. The
abstract representation corresponds to the declaration of the generic parameters in 
an Ada module. The theory obligations denote the semantics of the interface and 
Ada provides no corresponding structure. 

The abstract theorems are a body of facts concerning the abstract objects. Usu- 
ally, the theorems are based on the theory obligations and can stand alone only 
after the theory obligations have been met. 

Our goal is to instantiate the generic theory with a concrete representation. To 
effect the instantiation, the concrete representation must meet the syntactic re- 
quirements of the abstract representation as well as the semantic requirements of 
the theory obligations. If the syntactic and semantic requirements are met, then 
the instantiation provides a collection of concrete theorems about the new repre- 
sentation. 
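
The three parts of a generic theory and the instantiation step can be sketched in Python. This is a loose analogy, not the implementation used in this work: the class GenericTheory, the monoid example, and all names are hypothetical. Obligations and theorems are modeled as predicates that are evaluated over a finite carrier, whereas in a theorem prover the abstract theorems are proven once from the obligations and need no re-checking.

```python
from dataclasses import dataclass
from functools import partial
from itertools import product
from typing import Callable, Dict, List

@dataclass
class GenericTheory:
    operations: List[str]                # abstract representation (names only)
    obligations: Dict[str, Callable]     # semantic requirements on an instance
    theorems: Dict[str, Callable]        # abstract theorems

    def instantiate(self, rep, carrier, prefix):
        """Return concrete theorems if rep meets both kinds of requirement."""
        carrier = list(carrier)
        # Syntactic requirement: every abstract operation must be supplied.
        missing = [name for name in self.operations if name not in rep]
        if missing:
            raise ValueError(f"missing operations: {missing}")
        # Semantic requirement: every theory obligation must hold.
        for name, holds in self.obligations.items():
            if not holds(rep, carrier):
                raise ValueError(f"obligation not met: {name}")
        return {f"{prefix}_{name}": partial(thm, rep, carrier)
                for name, thm in self.theorems.items()}

MONOID = GenericTheory(
    operations=["op", "e"],
    obligations={
        "left_id": lambda r, c: all(r["op"](r["e"], x) == x for x in c),
        "right_id": lambda r, c: all(r["op"](x, r["e"]) == x for x in c),
        "assoc": lambda r, c: all(
            r["op"](x, r["op"](y, z)) == r["op"](r["op"](x, y), z)
            for x, y, z in product(c, repeat=3)),
    },
    theorems={
        # Any element that behaves as an identity equals e.
        "identity_unique": lambda r, c: all(
            f == r["e"] for f in c
            if all(r["op"](a, f) == a and r["op"](f, a) == a for a in c)),
    },
)

# Addition modulo 3 meets the obligations, so instantiation succeeds.
add3 = MONOID.instantiate({"op": lambda x, y: (x + y) % 3, "e": 0},
                          range(3), "ADD3")
```

Attempting the same instantiation with subtraction modulo 3 is rejected, because 0 is not a left identity for subtraction.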

Several specification and verification systems support generic theories. Some, 
such as OBJ and EHDM, offer explicit support. HOL, the verification environment 
used in the research reported here, does not explicitly support generic theories; 
however, HOL’s metalanguage, ML, combined with higher-order logic, provides a 
framework sufficient for implementing generic theories. 


2.3.1 OBJ. 


OBJ is a specification and programming language developed by Joseph Goguen et 
al. that has most recently been described in [GW88]. OBJ is widely known and the 
semantics of its theories and views match our use of generic theories much more 
closely than do Ada generics. 

OBJ is based on a many-sorted (or typed) algebraic semantics and supports 
parameterized specification and programming [Gog84]. OBJ has three kinds of 
entities: 

1. Objects, which are concrete modules that encapsulate executable code, 

2. Theories, which are parameterized modules that correspond to generic the- 
ories as used in this dissertation, and 

3. Views, which bind objects and theories to parameters in another theory. 

Objects are said to contain executable code because the expressions in an object 
module give the initial algebraic semantics of the sorts and operations being defined. 
The fact that their semantics is initial implies that they describe just one model (up 
to isomorphism). Theories, on the other hand, are said to have a “loose” semantics 
since they define a variety of models. A loose semantics describes a class of objects; 
any member of that class will satisfy the theory. 

A view is not an instantiation. Instantiation is done using a special command,
make, after the view has been established. A view can be seen as a mapping of 
the operators and objects from one module onto a theory, as well as a declaration 
of intent that the module meets the obligations set forth in the equations of the 
theory module. OBJ does not require that the user prove that the obligations are 
met — a simple declaration is sufficient. Of course, if the view is not proper, then 
the OBJ program will not operate as intended. 


2.3.2 EHDM. 

EHDM is a specification and verification system that is being developed by SRI 
International [EHD88]. The language of EHDM is based on first-order predicate 
logic, but includes some elements of higher-order logic as well. For example,
variables can range over functions, functions can return other functions, and functions
can appear in quantifications. Parameterized modules are an important part of 
the EHDM language where they are used to organize specifications. Modules can 
be parameterized with types, constants, and functions. The module parameters 
can have constraints placed on them that must be met before the module can be 
instantiated. 

In EHDM, a parameterized module is called a generic module and an instantia- 
tion is called a module instance. EHDM module declarations give the uninterpreted 
types, constants, and functions over which the module is parameterized. This dec- 
laration is analogous to our abstract representation. 

The module body contains (among other things) an ASSUMING clause that 
gives the properties of the module parameters. The formulae in the ASSUMING 
clause are analogous to our theory obligations. 

The module can also contain declarations of concrete types, constants, and func- 
tions that define the theory associated with the module and proofs of theorems 
about the abstract operations in the theory. These proofs may rely on the formulae 
in the ASSUMING clause. 


2.3.3 HOL. 

HOL is a general theorem proving system developed at the University of Cambridge 
[Gor88,CGM87] that is based on Church’s theory of simple types, or higher-order 
logic [Chu40]. Church developed higher-order logic as a foundation for mathematics, 
but it can be used for describing and reasoning about computational systems of 
all kinds. Higher-order logic is similar to the more familiar predicate logic, but
allows quantification over predicates and functions, not just variables, allowing more 
general systems to be described. 

HOL grew out of Robin Milner’s LCF theorem prover [GMW79] and is similar to
other LCF progeny such as NUPRL [Con86]. Because HOL is the theorem proving 
environment used in the body of this work, we will describe it in more detail. 

HOL’s proof style can be tailored to the individual user, but most users find
it convenient to work in a goal-directed fashion. HOL is a tactic-based theorem
prover. A tactic breaks a goal into one or more subgoals and provides a justification 
for the goal reduction in the form of an inference rule. Tactics perform tasks such 
as induction, rewriting, and case analysis. At the same time, HOL allows forward 
inference and many proofs are a combination of both forward and backward proof 
styles. Any theorem proving strategy a user employs in connection with HOL is 
checked for soundness, eliminating the possibility of incorrect proofs. 

HOL provides a metalanguage, ML, for programming and extending the theorem 
prover. Using the metalanguage, tactics can be put together to form more powerful 
tactics, new tactics can be written, and theorems can be combined into new theories 
for later use. The metalanguage makes the HOL verification system extremely 
flexible. 

In HOL, all proofs, even tactic-based proofs, are eventually reduced to the
application of inference rules. Most non-trivial proofs require large numbers of inferences.
Proofs of large devices such as microprocessors can take many millions of inference
steps. In a proof containing millions of steps, what kind of confidence do we have
that the proof is correct? One of the most important features of HOL is that it
is secure, meaning that new theorems can only be created in a controlled manner.
HOL is based on 5 primitive axioms and 8 primitive inference rules. All of the
high-level inference rules and tactics do their work through some combination of the
primitive inference rules. Because the entire proof can be reduced to one using only
8 primitive inference rules and 5 primitive axioms, an independent proof checking
program could check the proof syntactically.
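
The idea that theorems can only be created in a controlled manner can be sketched in Python. The fragment below is a simplified illustration, not HOL itself: it covers only implication, the rule names ASSUME, DISCH, and MP are borrowed from HOL's rules of the same names, and the derived rule UNDISCH is built from them. Python cannot hide the constructor the way an ML abstract type does, so the key check stands in for that protection by convention only.

```python
class Theorem:
    """A sequent: hyps |- concl. Only inference rules should construct one."""
    _key = object()                      # stand-in for ML's abstract type

    def __init__(self, hyps, concl, key):
        if key is not Theorem._key:
            raise TypeError("theorems can only be created by inference rules")
        self.hyps = frozenset(hyps)
        self.concl = concl

def ASSUME(p):
    """p |- p"""
    return Theorem({p}, p, Theorem._key)

def DISCH(p, th):
    """From A |- q infer A - {p} |- p ==> q."""
    return Theorem(th.hyps - {p}, ("==>", p, th.concl), Theorem._key)

def MP(th_imp, th_p):
    """From A |- p ==> q and B |- p infer A u B |- q."""
    c = th_imp.concl
    if not (isinstance(c, tuple) and len(c) == 3
            and c[0] == "==>" and c[1] == th_p.concl):
        raise ValueError("MP: theorems do not match")
    return Theorem(th_imp.hyps | th_p.hyps, c[2], Theorem._key)

def UNDISCH(th):
    """Derived rule: from A |- p ==> q infer A, p |- q (uses only MP, ASSUME)."""
    return MP(th, ASSUME(th.concl[1]))

# |- p ==> p, obtained purely by applying primitive rules.
identity = DISCH("p", ASSUME("p"))
```

Because every Theorem value has passed through one of the primitive rules, a derived rule such as UNDISCH cannot introduce an unsound step no matter how it is programmed.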


2.3.3.1 The Language. 

The object language of HOL is described in this section. We will discuss HOL’s 
terms and types. 


Terms. All HOL expressions are made up of terms. There are four kinds of 
terms in HOL: variables, constants, function applications, and abstractions (lambda 
expressions). Variables and constants are denoted by any sequence of letters, digits, 
underlines and primes starting with a letter. Constants are distinguished in the 
logic; any identifier that is not a distinguished constant is taken to be a variable. 
Constants and variables can have any finite arity, not just 0, and thus can represent 
functions as well. 

Function application is denoted by juxtaposition, resulting in a prefix syntax. 

Operator    Application    Meaning
=           t1 = t2        t1 equals t2
,           t1,t2          the pair t1 and t2
∧           t1 ∧ t2        t1 and t2
∨           t1 ∨ t2        t1 or t2
⇒           t1 ⇒ t2        t1 implies t2

Table 2.2: HOL Infix Operators


Binder    Application    Meaning
∀         ∀ x. t         for all x, t
∃         ∃ x. t         there exists an x such that t
ε         ε x. t         choose an x such that t is true

Table 2.3: HOL Binders


Thus a term of the form "t1 t2" is an application of the operator t1 to the operand
t2. Its value is the result of applying t1 to t2.

An abstraction denotes a function and has the form "λ x. t". An abstraction
"λ x. t" has two parts: the bound variable x and the body of the abstraction t. It
represents a function, f, such that "f(x) = t". For example, "λ y. 2*y" denotes
a function on numbers which doubles its argument.

Constants can belong to two special syntactic classes. Constants of arity 2 can
be declared to be infix. Infix operators are written "rand1 op rand2" instead of
in the usual prefix form: "op rand1 rand2". Table 2.2 shows several of HOL’s
built-in infix operators.

Constants can also belong to another special class called binders. A familiar example
of a binder is ∀. If c is a binder, then the term "c x. t" (where x is a variable) is
written as shorthand for the term "c(λ x. t)". Table 2.3 shows several of HOL’s
built-in binders.

In addition to the infix constants and binders, HOL has a conditional statement 
that is written a → b | c, meaning “if a, then b, else c.”


Types. HOL is strongly typed to avoid Russell’s paradox and others like it.
Russell’s paradox occurs in a higher-order logic when one can define a predicate that leads
to a contradiction. Specifically, suppose that we define P as P(x) = ¬x(x) where
¬ denotes negation. P is true when its argument applied to itself is false. Applying
P to itself leads to a contradiction since P(P) = ¬P(P) (i.e. true = false). This
kind of paradox can be prevented by typing since, in a typed system, the type of P
would never allow it to be applied to itself.

Every term in HOL is typed according to the following recursive rules: 

Operator      Arity    Meaning
bool          0        booleans
ind           0        individuals
num           0        natural numbers
(*)list       1        lists of type *
(*,**)prod    2        products of * and **
(*,**)sum     2        coproducts of * and **
(*,**)fun     2        functions from * to **

Table 2.4: HOL Type Operators


• Each constant or variable has a fixed type. 

• If x has type $\alpha$ and t has type $\beta$, the abstraction λ x. t has the type $(\alpha \to \beta)$.

• If t has the type $(\alpha \to \beta)$ and u has the type $\alpha$, the application t u has the
type $\beta$.
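
The three typing rules above, together with the four kinds of terms, can be sketched as a small Python type checker. The class names Var, Const, App, Abs, and Fun are hypothetical, base types are plain strings, and type variables (polymorphism) are left out, so this is a sketch of the simply typed core only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fun:
    """The function type (arg -> res)."""
    arg: object
    res: object

@dataclass(frozen=True)
class Var:
    name: str
    ty: object

@dataclass(frozen=True)
class Const:
    name: str
    ty: object

@dataclass(frozen=True)
class App:
    rator: object      # the operator t
    rand: object       # the operand u

@dataclass(frozen=True)
class Abs:
    bvar: Var          # the bound variable x
    body: object       # the body t

def type_of(term):
    """Compute the type of a term, or raise TypeError if it is ill-typed."""
    if isinstance(term, (Var, Const)):
        return term.ty                                  # rule 1: fixed type
    if isinstance(term, Abs):
        return Fun(term.bvar.ty, type_of(term.body))    # rule 2: abstraction
    if isinstance(term, App):                           # rule 3: application
        f, a = type_of(term.rator), type_of(term.rand)
        if not isinstance(f, Fun) or f.arg != a:
            raise TypeError("operator and operand types do not match")
        return f.res
    raise TypeError("not a term")

# The doubling function "λ y. 2*y" from the text, with * written prefix.
times = Const("*", Fun("num", Fun("num", "num")))
y = Var("y", "num")
double = Abs(y, App(App(times, Const("2", "num")), y))
double_type = type_of(double)          # Fun("num", "num")
```

The self-application x(x) at the heart of Russell’s paradox is rejected by the application rule: whatever function type x is given, its argument type cannot equal the type of x itself.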

Types in HOL are built from type variables and type operators. Type variables 
are denoted by a sequence of asterisks (*) followed by a (possibly empty) sequence of 
letters and digits. Thus *, ***, and *ab2 are all valid type variables. All type
variables are universally quantified implicitly, yielding type polymorphic expressions.

Type operators construct new types from existing types. Each type operator has 
a name (denoted by a sequence of letters and digits beginning with a letter) and an 
arity. If $\sigma_1, \ldots, \sigma_n$ are types and op is a type operator of arity n, then $(\sigma_1, \ldots, \sigma_n)$op
is a type. Note that type operators are postfix while normal function application is 
prefix or infix. A type operator of arity 0 is a type constant. 

HOL has several built-in types which are listed in Table 2.4. The type operators
bool, ind, and fun are primitive. HOL has a special syntax that allows (*,**)prod
to be written as (* # **), (*,**)sum to be written as (* + **), and (*,**)fun
to be written as (* -> **).


2.3.3.2 The Proof System.

HOL is not an automated theorem prover but is more than simply a proof checker, 
falling somewhere between these two extremes. HOL has several features that 
contribute to its use as a verification environment: 

1. Several built-in theories, including booleans, individuals, numbers, products, 
sums, lists, and trees. These theories contain the five axioms that form the 
basis of higher-order logic as well as a large number of theorems that follow 
from them. 

2. Rules of inference for higher-order logic. These rules contain not only the eight
basic rules of inference from higher-order logic, but also a large body of derived
inference rules that allow proofs to proceed using larger steps. The HOL
system has rules that implement the standard introduction and elimination
rules for Predicate Calculus as well as specialized rules for rewriting terms.

3. A collection of tactics. Examples of tactics include REWRITE_TAC, which rewrites
a goal according to some previously proven theorem or definition, GEN_TAC,
which removes unnecessary universally quantified variables from the front of
terms, and EQ_TAC, which says that to show two things are equivalent, we should
show that they imply each other.

4. A proof management system that keeps track of the state of an interactive 
proof session. 

5. A metalanguage, ML, for programming and extending the theorem prover. 
Using the metalanguage, tactics can be put together to form more powerful 
tactics, new tactics can be written, and theorems can be aggregated to form 
new theories for later use. The metalanguage makes the verification system 
extremely flexible. 
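
The notion of a tactic used in items 3 and 5, a function that reduces a goal to subgoals and supplies a justification for the reduction, can be sketched in Python. The names CONJ_TAC, ACCEPT_TAC, and THENL follow HOL's, but the bodies are simplified stand-ins: goals are nested tuples, and Thm is an unprotected record, whereas in HOL the justification would call inference rules to build a genuine theorem.

```python
from collections import namedtuple

Thm = namedtuple("Thm", "concl")    # stand-in for a proven theorem

def CONJ_TAC(goal):
    """Split the goal ("and", a, b) into the two subgoals a and b."""
    if not (isinstance(goal, tuple) and len(goal) == 3 and goal[0] == "and"):
        raise ValueError("CONJ_TAC: goal is not a conjunction")
    _, a, b = goal
    def justify(thms):
        ta, tb = thms
        if ta.concl != a or tb.concl != b:
            raise ValueError("CONJ_TAC: subgoal theorems do not match")
        return Thm(("and", a, b))
    return [a, b], justify

def ACCEPT_TAC(th):
    """Solve a goal outright with an existing theorem."""
    def tactic(goal):
        if th.concl != goal:
            raise ValueError("ACCEPT_TAC: theorem does not match goal")
        return [], lambda thms: th
    return tactic

def THENL(tac, tacs):
    """Apply tac, then apply the i-th tactic in tacs to the i-th subgoal."""
    def combined(goal):
        subgoals, just = tac(goal)
        if len(subgoals) != len(tacs):
            raise ValueError("THENL: wrong number of tactics")
        results = [t(g) for t, g in zip(tacs, subgoals)]
        remaining = [g for subs, _ in results for g in subs]
        def justify(thms):
            thms, proved = list(thms), []
            for subs, j in results:
                k = len(subs)
                proved.append(j(thms[:k]))   # justify each intermediate subgoal
                thms = thms[k:]
            return just(proved)
        return remaining, justify
    return combined

def prove(goal, tactic):
    """Run a tactic that must leave no subgoals and return the theorem."""
    subgoals, justify = tactic(goal)
    if subgoals:
        raise ValueError("unsolved subgoals remain")
    return justify([])

A, B = Thm("a"), Thm("b")            # assumed to be previously proven
ab = prove(("and", "a", "b"), THENL(CONJ_TAC, [ACCEPT_TAC(A), ACCEPT_TAC(B)]))
```

THENL is an example of putting tactics together to form a more powerful tactic: the combined tactic is itself a function from a goal to subgoals and a justification.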


2.3.3.3 Generic Theories in HOL. 

HOL provides a non-parameterized module structure called a theory. A theory is a 
set of types, definitions, constants, axioms and parent theories. Higher-order logic 
is extended by defining new theories. To use a theory, one declares it a parent of 
the current draft theory; all of the components of the parent and its ancestors are 
then available for use in the child theory. 

HOL does not explicitly support parameterized, or generic, theories and thus 
might seem a poor vehicle for the research presented in this dissertation. However, 
HOL’s other features, in particular its flexible proof style and programmability, 
make it a desirable system in which to work. We choose to use HOL and implement 
generic theories. The fact that generic theories can be defined at the user level in 
HOL without explicit support for them in the system is a testament to the flexibility 
of HOL. 

Three things are required to implement generic theories in HOL:

1. A way of representing abstract objects and operations. 

2. A method for declaring theory obligations and using these obligations in 
proofs. 

3. Functions for instantiating an abstract theory with concrete objects.

Our implementation of abstract theories in HOL is described in Appendix A.

Jeffrey Joyce of Cambridge University presents a method of representing abstract 
objects and operations using HOL in [Joy89a] where he uses them to provide an 
abstract view of n-bit words. We use Joyce’s methods in our implementation of 
abstract representations. We have extended Joyce’s work by developing full-fledged 
generic theories including theory obligations and methods of instantiating a generic 
theory. 

HOL has a type polymorphic logic that supports top-level universal quantification 
over type variables; we use type variables to denote abstract objects. Abstract 
operations on these objects make use of HOL’s ability to quantify over functions. 
Abstract operations are synthesized by creating variables that hold n-tuples; each 
entry in the tuple represents one of the abstract operations. The abstract operations 
select the appropriate field from the tuple. 

We have implemented an ML function that defines a new abstract representation. 
The following example shows how our implementation can be used to define an 
abstract representation for groups. (Recall that type variables are denoted in HOL 
by prepending an asterisk.) 


let G = new_abstract_representation
   [
    (`op`, ":(*group × *group) → *group")
   ;
    (`e`, ":*group")
   ;
    (`inv`, ":*group → *group")
   ];;


The abstract representation is given as a list of pairs where the first member of the 
pair is a string giving the name of the abstract operation and the second is an HOL 
term giving its type. 1 

The theory obligations are declared by giving a list of terms; each term denotes 
a predicate that will be used as an axiom inside the generic theory. Note that the 
predicates are not true axioms since we want them to exist only inside the generic 
theory; they will be satisfied and discharged by the instantiation when the theory 
is used. Continuing the group theory example from above, we present the theory 
obligations: 

1 Lists in HOL take the form [x1; ...; xn]. Strings are enclosed in backquotes. HOL types always
begin with a colon.

new_theory_obligations
   ["∀ g:*group. (op rep (g,e rep) = g)";
    "∀ g:*group. (op rep (e rep,g) = g)";
    "∀ g:*group. (op rep (g,inv rep g) = (e rep))";
    "∀ g:*group. (op rep (inv rep g,g) = (e rep))";
    "∀ g g' g'':*group.
       op rep (g, op rep (g',g'')) = op rep (op rep (g,g'), g'')"
   ];;


The abstract operations op, e, and inv are all used as selectors on a variable called 
rep. Recall that function application in HOL is denoted by juxtaposition. The 
variable rep is a 3-tuple in this case; when the theory is instantiated rep will 
be replaced with a tuple containing three concrete functions. The five obligations 
given in the above example state the usual group theoretic requirements that e be 
an identity element, that inv be an inverse, and that op be associative. 
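
The selector encoding described above can be mimicked in Python. In this sketch the representation is a 3-tuple, op, e, and inv pick out its fields, and the five obligations are evaluated over a finite carrier. The obligation labels reuse the theorem names that appear in the instantiation example later in this section; the choice of the identity function as the inverse for exclusive-or is an assumption, since the text does not define XOR_INV.

```python
from itertools import product

# Selectors: "op rep" in the text corresponds to op(rep) here.
def op(rep):
    return rep[0]

def e(rep):
    return rep[1]

def inv(rep):
    return rep[2]

def obligations(rep, carrier):
    """Evaluate the five group obligations over a finite carrier."""
    carrier = list(carrier)
    return {
        "RIGHT_IDENT": all(op(rep)(g, e(rep)) == g for g in carrier),
        "LEFT_IDENT": all(op(rep)(e(rep), g) == g for g in carrier),
        "RIGHT_INVERSE": all(op(rep)(g, inv(rep)(g)) == e(rep) for g in carrier),
        "LEFT_INVERSE": all(op(rep)(inv(rep)(g), g) == e(rep) for g in carrier),
        "ASSOC": all(
            op(rep)(g, op(rep)(h, k)) == op(rep)(op(rep)(g, h), k)
            for g, h, k in product(carrier, repeat=3)),
    }

# Concrete representation for exclusive-or on booleans: (XOR, F, XOR_INV).
# XOR_INV is assumed to be the identity, since every boolean is its own
# inverse under exclusive-or.
xor_rep = (lambda a, b: a != b, False, lambda a: a)
xor_check = obligations(xor_rep, [False, True])

# Conjunction with True as identity is not a group: False has no inverse.
and_rep = (lambda a, b: a and b, True, lambda a: a)
and_check = obligations(and_rep, [False, True])
```

Replacing rep by a concrete tuple is all that instantiation has to do syntactically; the obligations then become statements about the concrete functions alone.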

Using the abstract representations and the obligations, we can prove theorems 
from group theory. For example, we can show that left cancellation holds, that the 
identity is unique, and that inverse reverses itself: 


LEFT_CANCELLATION =
⊢ ∀ x y a:*group.
    (op rep (a,x) = (op rep (a,y))) ⇒ (x = y)

IDENTITY_UNIQUE =
⊢ ∀ f:*group.
    (∀ a:*group. (op rep(a,f) = a) ∧ (op rep(f,a) = a)) ⇒
    (f = (e rep))

INVERSE_INVERSE_LEMMA =
⊢ ∀ a:*group. (inv rep (inv rep a)) = a
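
As an illustration of how such a theorem follows from the obligations alone, left cancellation can be derived by hand. The notation here is an informal shorthand, not the HOL proof: $a \cdot x$ stands for op rep (a,x), $e$ for e rep, and $a^{-1}$ for inv rep a. Assume $a \cdot x = a \cdot y$.

```latex
\begin{align*}
x &= e \cdot x                 && \text{left identity} \\
  &= (a^{-1} \cdot a) \cdot x  && \text{left inverse} \\
  &= a^{-1} \cdot (a \cdot x)  && \text{associativity} \\
  &= a^{-1} \cdot (a \cdot y)  && \text{hypothesis } a \cdot x = a \cdot y \\
  &= (a^{-1} \cdot a) \cdot y  && \text{associativity} \\
  &= e \cdot y                 && \text{left inverse} \\
  &= y                         && \text{left identity}
\end{align*}
```

Every step cites one of the five obligations or the hypothesis, so the result holds for any representation that meets the obligations.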


We can instantiate a generic theory by giving 

• the name of the generic theory, 

• a list of theorems showing that our instantiation meets the obligations for the 
generic theory, 

• a list of mappings from variables in the generic theory to concrete objects in 
the instantiation, and 

• a string that will be prepended to the names of the generic theorems to make 
them unique and prevent name clashes. 

For example, if we have defined exclusive-disjunction and proven the corresponding 
group theory obligations, we can instantiate the generic theory for groups as follows: 



let theorem_list =
   instantiate_abstract_theorems
      `group`
      [LEFT_IDENT; RIGHT_IDENT; LEFT_INVERSE;
       RIGHT_INVERSE; XOR_ASSOC]
      [("rep","(XOR, F, XOR_INV)")]
      `XOR`;;


This gives a list of all of the theorems in the generic theory specialized for our theory 
of exclusive-disjunction:


XOR_LEFT_CANCELLATION =
⊢ ∀ x y a. (XOR(a,x) = XOR(a,y)) ⇒ (x = y)

XOR_IDENTITY_UNIQUE =
⊢ ∀ f. (∀ a. (XOR(a,f) = a) ∧ (XOR(f,a) = a)) ⇒
       (f = F)

XOR_INVERSE_INVERSE_LEMMA =
⊢ ∀ a. XOR_INV (XOR_INV a) = a


Note that there is no mention of any part of the abstract representation in these 
theorems and the theorems are free of the theory obligations. In fact the theorems 
appear just as they would had we proven them directly rather than inheriting them 
from the generic theory. 


2.4 Using Logic to Specify Hardware. 


A circuit is a collection of devices composed by interconnection. Each of these 
devices has ports which are used for input, output, or both. The behavior of a 
device can be expressed in terms of its ports. Each of the devices in a circuit can, 
in turn, be viewed as a composition of still other devices. This hierarchy of devices 
eventually leads to the devices that the designer considers primitive. The smallest 
devices we will deal with in this dissertation are logic gates and indeed, in many
cases, we will stop much higher than even gates. 

Clocksin describes several ways to specify circuit structure [Clo87]: 

• We can use imperative declarations of the circuit structure (this is referred to 
as the extensional method). 

• We can use functions to describe the output in terms of the input. 

• We can use predicates in a quantified logic to relate the ports of a device using 
behavioral or structural constraints. 

Each of the methods has advantages and disadvantages. The extensional method
has the advantage of being familiar to designers since it resembles imperative lan- 
guages such as Pascal that most designers have used. The disadvantage of the 
extensional method is that it is difficult to treat formally, just as traditional imper- 
ative languages are hard to treat formally. 

The functional model is widely used; Hunt’s specification of the FM8501 pro- 
cessor, for example, is functional [Hun87]. To specify the behavior of sequential 
circuits functionally, the specification language must support recursion. Hunt uses 
recursion in his specification to describe the sequential operation of his CPU. 

In the functional model, circuit interconnection is given by the syntactic structure 
of function application. This can cause several problems: 

• Describing circuits with bi-directional ports is difficult since functional spec- 
ifications differentiate between input and output syntactically. 

• The purpose of a structural specification is to show how components are con- 
nected together. Since the only means of expressing connection is function 
application, even returning a tuple is insufficient for describing circuits with 
more than one output. 

• Sequential circuits feed back on themselves. Recursion is the best alternative, 
but that can be inadequate for circuits with multiple feedback paths. 

The predicate method is the most widely used specification technique in the HOL 
community [Gor86]. The disadvantage of the predicate method is that designers are 
likely to find it the most unfamiliar of the three and thus difficult to use. In addition, 
to use the predicate method, the logic must support existential quantification, either 
explicitly or implicitly. (Prolog’s Horn clause notation is an example of a language 
with implicit existential quantification.) The predicate method does, however, lend 
itself to a wide variety of circuit types, including those with multiple outputs and 
bi-directional ports. 

The specifications in this report will use a mixture of the functional and predicate 
methods. Functions will be used inside the specification, but the device structure 
will be specified using predicates. 


2.4.1 Specifying Circuits with Predicates. 

As an example of the predicate model, we will specify the behavior and structure 
of a very simple circuit designated D. The predicate that specifies the behavior of 
the circuit can be given by the following higher-order logic definition: 


⊢def D(a,b,c,d,out) = (out = (a ∧ b) ∨ (c ∧ d)) 

Figure 2.1: Implementation of a simple circuit, D 

Notice that the inputs and outputs are all included in the arguments and the be- 
havior is expressed as a constraint among the outputs and the inputs. 

One possible implementation for D is shown in Figure 2.1. As was mentioned 
earlier, each device can be thought of as representing a constraint on its inputs 
and outputs. For example, the top And gate constrains a, b, and p in a manner 
consistent with the behavior of the device. 


⊢def And(a, b, p) = (p = a ∧ b) 


To get the constraint represented by the entire device, we can compose the individual 
constraints using conjunction. 


And(a, b, p) ∧ And(c, d, q) ∧ Or(p, q, out) 


This expression constrains the values not only on the ports of the device, a, b, c, d, 
and out, but also on the internal lines p and q. We normally wish to regard such 
a device as a “blackbox” and consequently are really only interested in the values 
of the external lines. We can hide the internal lines using existentially quantified 
variables and define a predicate D_imp that represents the structure of the circuit. 


⊢def D_imp(a, b, c, d, out) = 
  ∃ p q. And(a, b, p) ∧ And(c, d, q) ∧ Or(p, q, out) 


For comparison, the following gives a specification of the same circuit using func- 
tions: 


⊢def D(a,b,c,d) = Or(And(a,b), And(c,d)) 


The outputs are not mentioned explicitly; the result of the function is taken to be 
the output of the circuit. 
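
The relationship between the three forms above can be illustrated with a small executable sketch. It is written in Python, not HOL, and models each predicate as a boolean-valued function; the existential quantifier over the internal lines p and q becomes a search over the four possible assignments. The names `D_fun` and `implementation_matches_spec` are introduced here for illustration only.

```python
from itertools import product

BOOLS = (False, True)


def And(a, b, p):
    """Gate as a constraint: p must equal a AND b."""
    return p == (a and b)


def Or(a, b, p):
    """Gate as a constraint: p must equal a OR b."""
    return p == (a or b)


def D(a, b, c, d, out):
    """Behavioral specification of the circuit as a predicate."""
    return out == ((a and b) or (c and d))


def D_imp(a, b, c, d, out):
    """Structural specification: internal lines p, q are hidden by 'exists'."""
    return any(And(a, b, p) and And(c, d, q) and Or(p, q, out)
               for p, q in product(BOOLS, repeat=2))


def D_fun(a, b, c, d):
    """Functional specification: the result is the output."""
    return (a and b) or (c and d)


def implementation_matches_spec():
    """Check D_imp(...) = D(...) for all 32 port assignments."""
    return all(D_imp(*ports) == D(*ports)
               for ports in product(BOOLS, repeat=5))
```

Note that the predicate forms accept the output as an argument and merely constrain it, whereas `D_fun` computes it; this is the syntactic distinction between input and output that the functional model imposes.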

Similarly, we can write an extensional specification of the circuit in a language such 
as VHDL [Arm89]: 

entity D_imp is 
  port (a, b, c, d : in Bit; outp : out Bit); 
end D_imp; 

architecture Structure of D_imp is 
  component ANDGate port (i1, i2 : in Bit; outp : out Bit); end component; 
  component ORGate port (i1, i2 : in Bit; outp : out Bit); end component; 
  signal p, q : Bit; 
begin 
  G1: ANDGate port map (a, b, p); 
  G2: ANDGate port map (c, d, q); 
  G3: ORGate port map (p, q, outp); 
end Structure; 


The difference between this specification and the predicate model of the circuit 
structure is largely superficial. The primary difference is the abundance of keywords 
in the extensional specification. The biggest impediment to using specification lan- 
guages such as VHDL is that they sometimes lack a clear semantics. This problem 
can be overcome by defining a semantics of the specification language in the object 
language of a verification system such as HOL. Van Tassel has done just that using VHDL 
and HOL in [Tas89,TH89]. 

2.4.2 Specifying Sequential Behavior. 

The last section specified a simple combinatorial circuit. We specify the behavior 
of sequential circuits in higher-order logic using an explicit representation of time. 

For example, we can specify the behavior of a simple latch as follows: 


⊢def latch in out set = ∀ t. out (t+1) = (set t → in t | out t) 


In the specification, in, out, and set are functions of time. The value of a signal 
at time t is returned when the function representing the signal is applied to t. The 
specification says that the value of out at time t + 1 gets the value of the input 
port, in, at time t if the set line is high and remains unchanged otherwise. Notice 
the use of the universal quantification over time in defining the predicate. 
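
The latch predicate can be mimicked in an executable setting by modeling signals as functions from natural numbers to booleans. The following Python sketch is only an approximation of the HOL definition: the universal quantifier over all time is replaced by a check over a finite `horizon`, which is an assumption introduced here. The helper `simulate_latch` is likewise hypothetical; it constructs one output signal that satisfies the predicate.

```python
def latch(inp, out, set_, horizon):
    """Bounded check of: for all t. out(t+1) = (set t -> in t | out t)."""
    return all(out(t + 1) == (inp(t) if set_(t) else out(t))
               for t in range(horizon))


def simulate_latch(inp, set_, initial, horizon):
    """Build an output signal that satisfies the latch predicate up to horizon."""
    values = [initial]
    for t in range(horizon):
        values.append(inp(t) if set_(t) else values[t])
    return lambda t: values[t]


# Hypothetical input signals for illustration.
example_in = lambda t: t % 2 == 0     # high on even ticks
example_set = lambda t: t % 3 == 0    # latch enabled every third tick
example_out = simulate_latch(example_in, example_set, False, 10)
```

The predicate does not compute the output; it accepts or rejects a proposed output signal. A constant-low output, for instance, is rejected because the latch must go high at time 1 when `example_set(0)` and `example_in(0)` are both true.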

We can also use existential quantification to describe temporal operators. For 
example, suppose that we wish to define a predicate that says that a signal will 
eventually go high. The following is a definition of an EVENTUALLY operator: 


⊢def EVENTUALLY d t1 = ∃ t2. t2 > t1 ∧ d t2 

When applied to the signal d and the current time, t1, the predicate states that 
there exists a time, t2, in the future when the signal d will be true. Existential 
quantification over time is also used to specify the behavior of asynchronous 
interconnections between devices. 
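
A bounded analogue of the EVENTUALLY operator can be sketched in Python as follows. The `horizon` parameter is an assumption added for executability: the logical definition quantifies over all future times, whereas the sketch can only search a finite window, so a false result means "not within the horizon" rather than "never".

```python
def eventually(d, t1, horizon):
    """Bounded EVENTUALLY: some t2 with t1 < t2 < horizon makes d(t2) true."""
    return any(d(t2) for t2 in range(t1 + 1, horizon))


# Hypothetical signal that goes high only at tick 5.
pulse_at_5 = lambda t: t == 5
```

The comparison is strict, as in the definition: a signal that is high only at the current time t1 does not satisfy the predicate at t1.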

Many of the specifications in this report will be sequential and will use this explicit 
representation of time. In addition to these uses of explicit quantification to treat 
sequential behavior, Joyce [Joy89a] has shown how temporal logic can be embedded 
in higher-order logic. 


2.4.3 Abstraction and Specification. 


Specifications can be written for many purposes. For example, in specifying a two 
input binary decoder, one might write: 


⊢def decoder_spec s0 s1 o0 o1 o2 o3 = 
  (o0 = (s1 → (s0 → F | F) | (s0 → F | T))) ∧ 
  (o1 = (s1 → (s0 → F | F) | (s0 → T | F))) ∧ 
  (o2 = (s1 → (s0 → F | T) | (s0 → F | F))) ∧ 
  (o3 = (s1 → (s0 → T | F) | (s0 → F | F))) 



While this specification works, its meaning is not very clear. 
Here is another specification for the same behavior: 


⊢def decoder_spec s0 s1 o0 o1 o2 o3 = 
  (o0 = ¬s1 ∧ ¬s0) ∧ 
  (o1 = ¬s1 ∧ s0) ∧ 
  (o2 = s1 ∧ ¬s0) ∧ 
  (o3 = s1 ∧ s0) 



This specification closely models one possible implementation for the circuit; conse- 
quently, using it as the behavioral specification would make the verification easier, 
but would not tell us much about the abstract behavior of the decoder. 

The next specification is more abstract and says more about the behavior of the 
decoder: 


⊢def decoder_spec s0 s1 o0 o1 o2 o3 = 
  (o0 ↔ ((s1,s0) = (F,F))) ∧ 
  (o1 ↔ ((s1,s0) = (F,T))) ∧ 
  (o2 ↔ ((s1,s0) = (T,F))) ∧ 
  (o3 ↔ ((s1,s0) = (T,T))) 

This specification clearly shows the binary numbers being represented by the inputs. 
Moreover, the specification does not suggest any particular implementation. In 
general, the more abstract a specification, the easier it is to understand, but the more 
difficult it is to verify. 

We could make the above specification even more abstract by defining a function, 
pairval, that converts boolean pairs into numbers and then writing the specifica- 
tion as follows. 


⊢def decoder_spec s0 s1 o0 o1 o2 o3 = 
  let n = pairval (s1,s0) in 
  (o0 ↔ (n = 0)) ∧ 
  (o1 ↔ (n = 1)) ∧ 
  (o2 ↔ (n = 2)) ∧ 
  (o3 ↔ (n = 3)) 


This specification can be easily generalized to have n inputs and 2^n outputs. 
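
The four decoder specifications, and the generalization to n inputs, can be compared with a short Python sketch. It is an illustration rather than part of the HOL development: `pairval` is given one plausible definition (s1 is taken as the most significant bit), and `bitval` and `decoder_n` are hypothetical names for the n-input generalization.

```python
from itertools import product

BOOLS = (False, True)


def spec_conditional(s0, s1, o0, o1, o2, o3):
    """First form: nested conditionals."""
    return (o0 == ((False if s0 else False) if s1 else (False if s0 else True))
            and o1 == ((False if s0 else False) if s1 else (True if s0 else False))
            and o2 == ((False if s0 else True) if s1 else (False if s0 else False))
            and o3 == ((True if s0 else False) if s1 else (False if s0 else False)))


def spec_gates(s0, s1, o0, o1, o2, o3):
    """Second form: mirrors an AND/NOT gate implementation."""
    return (o0 == (not s1 and not s0) and o1 == (not s1 and s0)
            and o2 == (s1 and not s0) and o3 == (s1 and s0))


def spec_pairs(s0, s1, o0, o1, o2, o3):
    """Third form: compares the input pair with each bit pattern."""
    return (o0 == ((s1, s0) == (False, False))
            and o1 == ((s1, s0) == (False, True))
            and o2 == ((s1, s0) == (True, False))
            and o3 == ((s1, s0) == (True, True)))


def pairval(pair):
    """Assumed definition: interpret (s1, s0) as a 2-bit number."""
    s1, s0 = pair
    return 2 * int(s1) + int(s0)


def spec_numeric(s0, s1, o0, o1, o2, o3):
    """Fourth form: output i is high exactly when the inputs encode i."""
    n = pairval((s1, s0))
    return all(o == (n == i) for i, o in enumerate((o0, o1, o2, o3)))


def bitval(bits):
    """Value of a bit sequence, most significant bit first."""
    n = 0
    for b in bits:
        n = 2 * n + int(b)
    return n


def decoder_n(inputs, outputs):
    """n-input generalization with 2**n outputs."""
    if len(outputs) != 2 ** len(inputs):
        return False
    n = bitval(inputs)
    return all(o == (i == n) for i, o in enumerate(outputs))


def all_specs_agree():
    """All four 2-input forms accept exactly the same port assignments."""
    specs = (spec_conditional, spec_gates, spec_pairs, spec_numeric)
    return all(len({spec(*ports) for spec in specs}) == 1
               for ports in product(BOOLS, repeat=6))
```

The numeric form is the only one whose body does not mention the number of inputs explicitly, which is why it generalizes most directly.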

Chapter 3 


Interpreters 


In Chapter 2, we presented a description of a general model for specifying the 
behavior of microprocessors. The model had four parts: 

1. A representation of the state, S. 

2. A set of state transition functions, J, denoting the behavior of the individual 
instructions of the microprocessor. 

3. A next state function, N, that selects a function from the set J according to 
the current state. 

4. A predicate, I, relating the state at time t + 1 to the state at time t by means 
of J and N. 

In this chapter we concentrate on this model, which we call the interpreter model. 

We begin with an informal discussion of the interpreter model. Much of our 
discussion follows that of [Anc86]. The chapter continues with a description of 
hierarchical decomposition. Finally, we present a mathematical definition of the 
interpreter model. This mathematical definition is formalized in Chapter 4. 

The top level view of an interpreter is shown in Figure 3.1. The distinguishing 
feature of an interpreter is that it has a flat control structure. One of n instructions 
is chosen based on the state. The chosen instruction operates on the state and the 
cycle begins anew. In a programming language, this model could be described using 
a case statement in a while loop. There are a large number of interesting computer 
systems that have a flat control structure: microprocessors, low-level system calls 
in an operating system, language interpreters, and editors are a few. Each of these 
is an instance of our general interpreter model. 
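
The "case statement in a while loop" reading can be made concrete with a toy machine. The Python sketch below is purely illustrative: the three opcodes, the (pc, acc) state, and the `max_steps` bound are all assumptions introduced here and do not correspond to any machine discussed in this report.

```python
def run(state, program, max_steps=1000):
    """Flat control structure: select one instruction from the state, apply it, repeat."""
    steps = 0
    while steps < max_steps:
        pc, acc = state
        opcode = program[pc]          # selection depends only on the state
        if opcode == "HALT":
            break
        elif opcode == "INC":         # each case is one state transition
            state = (pc + 1, acc + 1)
        elif opcode == "DBL":
            state = (pc + 1, acc * 2)
        else:
            raise ValueError("unknown opcode: " + repr(opcode))
        steps += 1
    return state
```

For example, `run((0, 1), ("INC", "DBL", "INC", "HALT"))` passes through the states (1, 2), (2, 4), and (3, 5) before halting. There is no nesting of control: every iteration makes one selection and performs one transition.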

The interpreter model is useful for modeling multiple abstraction levels in a mi- 
croprocessor specification. So, before discussing the interpreter model, we introduce 
hierarchical decomposition. 



Figure 3.1: An interpreter has a flat control structure. 

3.1 Hierarchical Decomposition. 


The goal of our work is microprocessor verification. There are two properties of a 
microprocessor specification that make its verification difficult: 

1. The size of $J_{macro}$, the instruction set at the macro-level, is large. A typical 
instruction set has on the order of $2^6$ to $2^8$ instructions. For example, the 
original VIPER proof had 128 instruction cases. 

2. The specification describing the electronic block model, $I_{EBM}$, is large. The 
formal specification of the electronic block model for a typical microprocessor 
can take many pages. The expanded expression describing VIPER’s electronic 
block model is 7 pages long. 

According to the instruction correctness lemma (introduced in Section 2.2), we need 
to show that the electronic block model correctly implements each instruction in 
the macro-level in order to verify the microprocessor; this results in hundreds of 
multi-page theorems that must be proven. 


3.1.1 The Hierarchy. 

In order to reduce the number of long, difficult theorems, we have developed a 
strategy for describing the specification by means of a series of increasingly ab- 
stract interpreters. This strategy, which follows conventional microprocessor design 
practice, is shown in Figure 3.2. 

At the bottom of the hierarchy is a structural description of the electronic block 
model. By specifying the electronic block model as a circuit rather than an interpreter, 
we ground the abstract behavioral descriptions in the hierarchy to the circuit 
model, which is familiar to hardware designers. We specify circuits as conjunctions 
of predicates as described in Section 2.4. 

Figure 3.2: A microprocessor specification can be decomposed as a series 
of interpreters. 

The electronic block model is the lowest level that we will consider in this disserta- 
tion; below the electronic block model, the circuit no longer behaves as a computer, 
but rather as pieces of a computer. In order to implement the computer, however, 
the electronic block model would have to be reduced to gates. 

The phase-level specification describes the behavior of the electronic block model 
from the perspective of the register transfer actions. During each phase, or clock 
sub-cycle, a set of elementary operations is executed in parallel by the machine. 
The phase-level specification ties each set of operations to a particular phase of the 
clock and states how the clock is sequenced. The sequencing of phases is usually a 
trivial serialization, although some conditional operation may be present in order 
to respond to asynchronous external events and error conditions. 

The phase-level either implements the macro-level directly (in a hardwired ma- 
chine) or implements the micro-level. In the latter case, the actions taken during 
each phase are conditioned upon the contents of the microstore. Thus, every mi- 
croinstruction is implemented by a composition of phases operating on the contents 
of the microstore. 

The micro-level description (if present) is a behavioral model of the micro-level 
interpreter. The micro-level is an abstraction of the phase-level: 

• Time at the micro-level is more coarsely grained than time at the phase-level. 
At the micro-level, time is measured by the execution of a single microinstruction. 
At the phase-level, time is measured by the execution of a single phase. 

• The state at the micro-level is a subset of the state at the phase-level. For 
example, there will be latches at the phase-level that are not important in de- 
scribing the behavior of the micro-level. These latches would not be included 
in the description of the micro-level state. 

• The behavioral description at the micro-level is concerned with a coarser 
sequence of actions. Rather than concentrating on what happens in parallel 
in the datapath of the CPU, we concentrate on the state transition effected 
by an entire microinstruction. 

At the top is the macro-level — the level visible to an assembly language pro- 
grammer. Just as the micro-level is an abstraction of the behavior specified by 
the phase-level interpreter, the macro-level is an abstraction of the micro-level 
interpreter. 

• Time at the macro-level is measured by the execution of a single macroin- 
struction. This instruction will be implemented by multiple microinstructions. 

• The state at the macro-level is a subset of the state at the micro-level. For 
example, the instruction register is usually not visible at the macro-level. 

• The behavioral specification of the macro-level describes the state transition 
for an entire macroinstruction. 

The hierarchical decomposition, and in particular the explicit representation of 
the phase-level as a behavioral specification, can significantly reduce the number of 
long, difficult theorems that must be proven in a microprocessor proof. The next 
section shows why this is so. 

3.1.2 Hierarchical Verification. 

We wish to establish that the structure specified in the electronic block model 
implies the behavior of the macro-level. Past microprocessor verification efforts 
[Hun87,Joy88,Coh88a] have been done in one step, directly showing that 

$$I_{EBM} \Rightarrow I_{macro}.$$

As we have seen, this can make the proof intractable for large microprocessors, 
due to the many long lemmas that need to be proven to establish the instruction 
correctness lemma. In fact, the VIPER verification [Coh88a] was never completed 
for this very reason; funding to complete the verification ran out before all of the 
cases could be considered. 

The hierarchical decomposition discussed in the last section provides a way of 
making the proof tractable: we can establish 

$$I_{EBM} \Rightarrow I_{macro}$$

in stages by showing 

$$I_{EBM} \Rightarrow I_{phase} \Rightarrow I_{micro} \Rightarrow I_{macro}$$


It may not be immediately obvious how this decomposition has solved the problem 
of verifying a large microprocessor. Recall that two things combine to make the 
verification of a level, $\ell$, in terms of another level, $\ell'$, difficult: 


1. the size of the term describing the implementing level, $I_{\ell'}$, and 

2. the number of instructions in the instruction set of the level being verified, $J_{\ell}$. 


The decomposition makes the proof tractable because although $I_{EBM}$ is still large, 
$J_{phase}$ typically contains from 2 to 4 instructions instead of the $2^6$ to $2^8$ instructions 
in the macro-level of a typical microprocessor. Thus, the number of long, difficult 
theorems is reduced by at least an order of magnitude. 
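
As a rough check of the order-of-magnitude claim, assume (as a simplification made here) that one long theorem is needed per instruction of the level verified directly against the electronic block model. The reduction factor then lies between the most and least favorable combinations of the figures quoted above:

$$\frac{2^6}{4} = 16 \qquad \text{and} \qquad \frac{2^8}{2} = 128.$$

Even the least favorable case, 64 macro-level instructions against 4 phase-level instructions, reduces the count by a factor of 16.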

The proof that the electronic block model correctly implements the phase-level 
interpreter is tedious, but can be done since the number of cases is small. The 
decomposition has, however, increased the total number of cases to be considered 
since we must now prove that the phase-level correctly implements the micro-level 
and that the micro-level correctly implements the macro-level. Fortunately, the 
proofs of $I_{phase} \Rightarrow I_{micro}$ and $I_{micro} \Rightarrow I_{macro}$ are very regular and most of the 
work in the proofs can be automated. The proofs between interpreter levels are 
automatable for two reasons: 


1. Both specifications have similar structure; they are both interpreters whereas 
the electronic block model description is a circuit. 

2. In a proof that one interpreter implements another, one can generally avoid 
dealing with the expanded form of the implementation, and so the goals are 
much smaller. 


As we will see in Chapter 5, even though there may be a large number of cases to 
consider in proving the instruction correctness lemma in these two levels, the cases 
are all similar and a single tactic suffices at both levels. Thus, the amount of human 
effort required to complete the proof is not substantially increased by the proofs of 
these two levels. 

Figure 3.3: A hierarchy of interpreters. 

3.2 Interpreter Hierarchies. 

The hierarchical decomposition of microprocessor specifications leads us to con- 
sider a general model of interpreter hierarchies. The most important issue in the 
interpreter model with respect to interpreter hierarchies is using one interpreter to 
implement another interpreter. 

Figure 3.3 shows a hypothetical interpreter hierarchy. In the hierarchy, the top 
level interpreter, $I_1$, is implemented by the one below it, $I_2$, which is implemented 
by the one below it, and so on. Each interpreter in the hierarchy is an abstraction 
of the one below it. They receive input from the environment, communicate with 
the interpreters above and below, and use some abstraction of the state. 

Because the interpreters on the top of the hierarchy are simply abstractions of the 
ones below them, they are not causal agents. Only the bottom interpreter sees the 
complete state (not an abstraction). Consequently, the bottom interpreter in the 
hierarchy is the only one that can modify the state. An interpreter in the hierarchy 
can only affect the state by issuing instructions to the interpreter below it. 

Figure 3.4 shows an individual interpreter from the hierarchy in more detail. The 
interpreter receives instructions from the interpreter above and issues instructions 
to the interpreter below. The interpreter does not merely pass the instructions 
along, but issues a new instruction stream to the interpreter below based on an 
interpretation of the instruction it has been asked to execute. The overall effect 
of this instruction stream is the state change required by the instruction being 
interpreted. When the interpreter is ready for the next instruction, it signals the 
interpreter it is implementing on the Next line. 

Figure 3.4: Interconnecting interpreters in the hierarchy. 

Figure 3.4 shows the state being filtered through an abstraction box before being 
sent to the interpreter. The filter has a switch line connected to the Next line. The 
abstraction serves two purposes: 

1. When the Next line is raised, it takes a snapshot of the current state. The 
interpreter at this level does not see the finer grained state changes of the 
interpreters below it; the interpreter only sees the state when it is time to 
make a decision about which instruction to issue next. 

2. The filter also performs a data abstraction on the state. As we will see in more 
detail later, the state visible at one level is a function of the overall system 
state. 


3.3 A Mathematical Definition of Interpreters. 


The rest of this chapter gives a mathematical definition to the interpreter model. 


3.3.1 Basic Types. 

The basic types for our model are defined in Table 3.1. In addition to these basic 
types, we also use the following type constructors: product, written $(\alpha \times \beta)$; 
coproduct, written $(\alpha + \beta)$; function, written $(\alpha \to \beta)$; and list, written $(\alpha)list$. 
An n-tuple is given by 

$$(\alpha_1 \times \alpha_2 \times \ldots \times \alpha_{n-1} \times \alpha_n)$$

which is a shorthand for 

$$(\alpha_1 \times (\alpha_2 \times \ldots \times (\alpha_{n-1} \times \alpha_n) \ldots))$$

Table 3.1: Basic types for interpreter definition. 

Symbol    Members          Meaning 
T         {true, false}    truth values 
N         {0, 1, ...}      natural numbers 
B         N → T            bit vectors 
M         N → B            stores 


3.3.2 State. 

Abstractly, we think of state as being something of type S, where S is an uninter- 
preted type. This allows us to treat state in an abstract manner, knowing nothing 
of its structure or content. 

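More concretely, we can represent state using n-tuples. We let $\mathbf{S}_n$ be the domain 
of n-tuples representing state. These n-tuples have the type 

$$(\alpha_1 \times \alpha_2 \times \ldots \times \alpha_{n-1} \times \alpha_n)$$

where 

$$\forall i.\ \alpha_i \in \mathbf{T} + \mathbf{B} + \mathbf{M}$$

We write $\mathbf{S} \preceq \mathbf{S}'$ to indicate that $\mathbf{S}$ is an abstraction of $\mathbf{S}'$. The fact that $\mathbf{S}$ is an 
abstraction of $\mathbf{S}'$ implies that there exists a function, $\mathcal{S} : \mathbf{S}' \to \mathbf{S}$. The function $\mathcal{S}$ 
is called the state abstraction function. 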


3.3.3 Time. 

In general, different levels in the interpreter hierarchy will have different views of 
time. We use temporal abstraction to produce a function that maps time at one 
level to time at another. Figure 3.5 shows a temporal abstraction function $\mathcal{T}$. The 
circles represent clock ticks. Notice that the number of clock ticks required at the 
implementing level to produce one clock tick at the implemented level is irregular. 
The temporal projection, $\mathcal{T}$, can be defined recursively on time. The resulting 
function is monotonically increasing and maps time at the implemented level to 
time at the implementing level. 

We will use members of $\mathbf{N}$ to represent time. Thus we define $\mathcal{T} : \mathbf{N} \to \mathbf{N}$ such 
that 

$$\forall n\ m.\ (n > m) \Rightarrow (\mathcal{T}(n) > \mathcal{T}(m))$$

We will discuss temporal abstraction in detail in Section 4.2.1. 
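
One way to picture a recursively defined temporal abstraction function is sketched below in Python. This is an assumption-laden illustration, not the definition given in Section 4.2.1: it supposes that a boolean signal at the implementing level (called `next_line` here, after the Next line of Figure 3.4) marks the ticks at which the implemented level advances, and that this signal is high infinitely often so that the search terminates.

```python
def make_temporal_abstraction(next_line):
    """Return T, mapping implemented-level time to implementing-level time.

    T(0) is the first tick at which next_line is high, and T(n+1) is the
    first such tick strictly after T(n).
    """
    def first_high_from(t):
        # Assumes next_line is eventually high; otherwise this loops forever.
        while not next_line(t):
            t += 1
        return t

    def T(n):
        if n == 0:
            return first_high_from(0)
        return first_high_from(T(n - 1) + 1)

    return T


# Hypothetical, deliberately irregular marker signal.
next_line = lambda t: t in (2, 3, 7, 12) or (t > 12 and t % 5 == 0)
T = make_temporal_abstraction(next_line)
```

With this signal, T maps 0, 1, 2, 3, 4, 5 to 2, 3, 7, 12, 15, 20. The gaps between successive values are irregular, yet the function is strictly increasing by construction, which is the property required above.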

Figure 3.5: A temporal abstraction function maps time at one level to time 
at another level. 

3.3.4 State Streams. 

A state stream, $u$, is a function from time to state, $\mathbf{N} \to \mathbf{S}$. We have chosen 
n-tuples of booleans, bit-vectors, and stores to represent state. We would like a 
representation of streams such that the application of a stream to some time, t, 
yields an n-tuple representing the state at time t. We use a lambda expression for 
our concrete representation. 

$$\lambda t.\ (a_1(t), a_2(t), \ldots, a_{n-1}(t), a_n(t))$$

where 

$$\forall i.\ a_i \in \mathbf{N} \to (\mathbf{T} + \mathbf{B} + \mathbf{M})$$

An important part of our theory will be the abstraction between state streams at 
different levels. When we say that state stream u is an abstraction of state stream 
u', we are saying 

1. that the members of u are state abstractions of the members of u' and 

2. there is a temporal mapping from time in u to time in u ' . 

There are two distinct types of abstraction going on: the first is a data abstraction 
and the second is a temporal abstraction. 

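Using the state abstraction function, $\mathcal{S}$, and the temporal abstraction function, 
$\mathcal{T}$, we say that u is an abstraction of u' if and only if 

$$\exists(\mathcal{S} : \mathbf{S}' \to \mathbf{S}).\ \exists(\mathcal{T} : \mathbf{N} \to \mathbf{N}).\ (u \circ \mathcal{T}) = (\mathcal{S} \circ u')$$

where $\circ$ denotes function composition. When this is true, we write 

$$u \preceq u'$$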

3.3.5 Environments. 


The environment represents the external world and it plays an important part in 
our theory. The environment is where interrupt requests originate, reset signals are 
generated, and so on. In our model, the environment is used only for input; output 
to the environment is assumed to be simply a function of the state. 

At the abstract level, we treat the environment as an abstract object. We know 
nothing about its structure or content. We denote it as $\mathbf{E}$. Just as we defined $\mathcal{S}$, 
the state abstraction function, we define an environment abstraction function, $\mathcal{E}$, 
such that $\mathcal{E} : \mathbf{E}' \to \mathbf{E}$. 

Concretely, we represent the environment using n-tuples of booleans and bit- 
vectors. We perform the same kinds of abstraction on the environment as on states: 
we assume that there exists a function, $\mathcal{E}$, that abstracts one environment tuple 
to another. Temporal abstraction is performed as it was for states. We define 
abstraction for environment streams in the same manner that we defined it for 
state streams. Thus, we write $e \preceq e'$ when e is a stream abstraction of e': 

$$\exists(\mathcal{E} : \mathbf{E}' \to \mathbf{E}).\ \exists(\mathcal{T} : \mathbf{N} \to \mathbf{N}).\ (e \circ \mathcal{T}) = (\mathcal{E} \circ e')$$

3.3.6 The Interpreter Specification. 

The preceding parts of this section have given preliminary definitions for concepts 
important in the mathematical definition of interpreters. This section presents that 
definition. 

Interpreters are state transition systems. The difference between our model of 
interpreters and other models of state transition systems such as deterministic finite 
automata ( dfa ) is that our model accounts for state abstraction and aggregation. 
By state aggregation, we are referring specifically to stores. A store represents a 
collection of state that we deal with as a monolithic unit. In a dfa model, each 
location in memory would be represented by a different piece of state which would 
be treated individually. This clearly would not work for sizable memories. 

The first step in defining an interpreter is to define a set of instructions. Let 
J* be the set of all functions with domain (S x E) and codomain S. Of course, 
not all functions in J* are meaningful; the specifier’s job is to choose meaningful 
functions. We use a subset of J* to represent the instruction set; we call this set J. 
The functions in J provide a denotational semantics for the instructions that they 
represent. 

In order to uniquely identify each instruction in J, we associate it with a unique 
key. At the abstract level, we take keys from the domain K. At the concrete level, 
keys can have various representations, as we will see in the example in Chapter 5. 

We must be able to choose instructions from J according to some predefined 
selection criteria. Usually, the selection will be based on the current state and 
environment. We define $\mathcal{K}$ to be a function with domain $(\mathbf{S} \times \mathbf{E})$ and codomain $\mathbf{K}$. 
Further, we define $\mathcal{C}$ to be a choice function that has domain $(J \times \mathbf{K})$ and codomain 
$(\mathbf{S} \times \mathbf{E} \to \mathbf{S})$. That is, $\mathcal{C}$ picks the state transition function from J that has a 
particular key in $\mathbf{K}$. 

We define an interpreter, $I[s,e]$, as a predicate over the state stream, s, and the 
environment, e. The definition of I is given as 

$$I[s,e] \equiv s(t+1) = \mathcal{C}(J,k)(s\ t)(e\ t)$$

where 

$$k = \mathcal{K}(s\ t,\ e\ t)$$

The predicate constrains the state of the interpreter at time t + 1 to be a function of 
the state and environment at time t. The function is determined by the instruction 
currently selected by $\mathcal{K}$. 
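
The interpreter predicate can be exercised on a toy example. The Python sketch below is an illustration under several assumptions introduced here: the instruction set J is a dictionary from keys to transition functions, the choice function is dictionary lookup, each transition function takes the state and environment as two arguments, time is checked only up to a finite `horizon`, and the counter machine with its RESET and COUNT keys is hypothetical.

```python
def choose(instruction_set, key):
    """Choice function: pick the transition function stored under a key."""
    return instruction_set[key]


def interpreter(s, e, J, K, horizon):
    """Bounded check that s(t+1) equals the selected instruction applied at time t."""
    return all(s(t + 1) == choose(J, K(s(t), e(t)))(s(t), e(t))
               for t in range(horizon))


def trace(initial, e, J, K, horizon):
    """Build a state stream that satisfies the predicate by construction."""
    states = [initial]
    for t in range(horizon):
        current = states[t]
        states.append(choose(J, K(current, e(t)))(current, e(t)))
    return lambda t: states[t]


# Hypothetical counter: the state is a number, the environment is a reset line.
J = {
    "RESET": lambda state, env: 0,
    "COUNT": lambda state, env: state + 1,
}
K = lambda state, env: "RESET" if env else "COUNT"
e = lambda t: t == 3                  # reset is asserted only at tick 3
s = trace(5, e, J, K, 6)
```

Here the stream s takes the values 5, 6, 7, 8, 0, 1 at times 0 through 5: the reset seen at time 3 determines the state at time 4. As with the circuit predicates of Section 2.4, the interpreter is a constraint on streams; it accepts `s` and rejects, for example, a stream that is constantly 5.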

3.3.7 Interpreter Verification. 

The goal of this formalization is to prove a correctness relation between the interpreters 
at different levels of a microprocessor abstraction. In particular, for two 
state streams, $s_l$ and $s_k$, and two environments, $e_l$ and $e_k$, where $s_l \preceq s_k$ and 
$e_l \preceq e_k$, we wish to show that 

$$I_k[s_k, e_k] \Rightarrow I_l[s_l \circ \mathcal{T}, e_l \circ \mathcal{T}]$$

where $\mathcal{T}$ is the temporal abstraction function defined in Section 3.3.3. When this 
implication is true, $I_l$ is an abstraction of $I_k$ and $I_k$ is said to implement $I_l$. 

We leave the proof of this for the formalization in Chapter 4. 


3.4 Composing Specifications. 


We have begun to examine how verified components can be composed to implement 
a more abstract behavioral specification. In the simplest case, where there is no 
shared state between the components, the problem reduces to the structural speci- 
fication problem discussed in Section 2.4.1. When the devices share state, however, 
the problem is more difficult. 

Consider a system consisting of a CPU and a memory subsystem with memory 
mapped I/O. We might be tempted to specify a single-ported memory unit as 
follows: 


⊢def MEMORY_UNIT write read memory address port = 
  (read t ⇒ (port = fetch(memory, address))) ∧ 
  (write t → (memory(t+1) = store(memory t, address, port)) 
           | (memory(t+1) = memory t)) 


This specification says that when the read signal is true, the port carries the 
value of memory at address. If the write signal is true, the memory is updated 
by storing the value on the port to the location given by address. Otherwise the 
value of memory remains unchanged. 

There is a problem with this specification if we expect to use the CPU with 
memory mapped I/O. The specification assumes that the CPU is the only device 
that can change memory. Obviously this is not the case if the memory is shared by 
the CPU and other devices making up the I/O subsystem. 

There are two obvious fixes to the problem: 

1. Use another kind of I/O that doesn’t require sharing the memory. 

2. Specify all of the I/O in the system being careful to account for all the changes 
that can occur to memory. 

The first solution is unpalatable since we would like to be able to specify a design 
with memory mapped I/O. The second is equally distasteful since it requires that 
we know all of the I/O needs that the system will ever have up front when the 
system is initially specified, or reverify the system with every change. 

Now consider the following specification of the memory unit: 


⊢def MEMORY_UNIT write read memory address port = 
  (read t ⇒ (port = fetch(memory, address))) ∧ 
  (write t → (memory(t+1) = store(memory t, address, port)) 
           | (memory(t+1) = trans(memory t))) 


The only difference is the addition of a function, trans, that transforms the value 
of memory at time t to a new value of memory at time t + 1 when a write is not 
occurring. 

The transformation function represents all of the changes that are occurring in 
memory for which the CPU is not responsible. Consider the stylized specifications 
for the simple case of a piece of state, S, shared by two devices. Device A specifies 
S as 

S(t+1) = sig t → store_A(S t) 
                | trans_A(S t) 


Device B specifies S as 


S(t+1) = sig t → store_B(S t) 
                | trans_B(S t) 


At a given time, trans_B is either store_A or I, the identity function. Similarly, 
trans_A is either store_B or I. 

We cannot know at the time of specification what the value of the transformation 
function will be. The use of an uninterpreted function (from a generic theory) as 
the transformation function allows the function to appear as a place holder in the 
proof. Later, when the specification is composed with the specification of another 
device, the uninterpreted transformation functions in both specifications can be 
instantiated with the appropriate values. 
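The instantiation step can be sketched in Python under simplifying assumptions: the shared state is a single integer, the two store functions are invented for the example, and the devices never write in the same cycle. None of these names come from the original theory.

```python
from typing import Callable

State = int
Trans = Callable[[State], State]

def identity(s: State) -> State:
    return s

def device_step(sig: bool, store: Trans, trans: Trans, s: State) -> State:
    """One device's view: S(t+1) = sig t -> store (S t) | trans (S t)."""
    return store(s) if sig else trans(s)

def store_a(s: State) -> State:
    """Hypothetical write performed by device A."""
    return s + 1

def store_b(s: State) -> State:
    """Hypothetical write performed by device B."""
    return s * 2

def composed_step(sig_a: bool, sig_b: bool, s: State) -> State:
    """Instantiate each device's trans with the other device's behavior."""
    assert not (sig_a and sig_b), "simultaneous writes are outside this sketch"
    trans_a = store_b if sig_b else identity
    trans_b = store_a if sig_a else identity
    next_a = device_step(sig_a, store_a, trans_a, s)
    next_b = device_step(sig_b, store_b, trans_b, s)
    assert next_a == next_b   # both specifications describe the same next state
    return next_a
```

Until composed_step is written, each device specification treats its trans argument as an opaque placeholder, which is the role the uninterpreted function plays in the proof.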

Having the transformation function appear directly in the specification, as was 
just shown, has a disadvantage: we must specify every level of the device using 
transformation functions and, more importantly, must deal with temporal issues 
between levels as they relate to the transformation functions. For example, the 
transformation occurring on the state at the micro-level is a composition of the 
smaller transformations occurring to the state at the phase-level. Thus, we need a
different transformation function at each level and we must know how they compose. 
This places a large burden on the proof, as these assumptions about the abstract 
transformation functions will all have to be put into the proof and discharged when 
the devices are composed. 

There may be times when we need to know the transformations taking place in
a piece of shared state in detail. Many times, however, we can use some abstraction
of the transformation and only look at changes to the state using a coarser
time granularity. We do not want to have the transformation functions appear in
the lower levels of the specification. In these cases, we can put the transformation
functions in the state abstraction function. By putting the transformation in the
state abstraction function for the micro-level state, for instance, the transformation
function appears in the specification of the macro-level, but not in the specification
of the micro-level (or below). This is the technique used in the specification
of AVM-1; we present a concrete example of its use in Chapter 5.

Chapter 4


The Formal Models 


This chapter presents a formalization of the theory developed in the last chapter.
There are two points to make before we formalize the mathematical definition:


• We are free to make some of the abstract entities in the mathematical def- 
inition more concrete. For example, we will represent the instruction set as 
a list. What we make concrete and what we leave abstract is a subjective 
choice. We want to make the definition concrete enough that we can prove 
interesting theorems about it without restricting the model in unnecessary 
ways. 

• There will be more details to consider concerning types, definitions, and so on 
since we are dealing with a formal system. HOL’s polymorphic type system 
frees us from some of this, but the details still have to be right. 

This chapter formalizes two interpreter models: a synchronous model and an 
asynchronous model. The terms “synchronous” and “asynchronous” are historical. 
Our original, synchronous model was too restrictive to support asynchronous memory
and thus was described as a model for synchronous memory machines, a name quickly
shortened to “the synchronous model.” The less restrictive model was naturally called
“asynchronous.” Perhaps the terms are unfortunate since, as we will see, the tem- 
poral abstraction in both models is synchronous. The synchrony is deterministic 
in the “synchronous” model and non-deterministic in the asynchronous model. 
Of course, calling the models “deterministic” and “non-deterministic” would be 
confusing as well and we have chosen to keep the historical names. The first part 
of this chapter will present the theory of synchronous interpreters. The second part 
of the chapter presents the less restrictive model of asynchronous interpreters. 


4.1 Synchronous Interpreters 


The theory presented in this section is for synchronous interpreters. In a syn- 
chronous interpreter model, the number of instructions in the implementation re- 
quired to implement each instruction in the interpreter is deterministic. Obviously 
for microprocessors that have loops in their microcode or use asynchronous memory, 
the synchronous model will not suffice. We present it, however, because it is more
straightforward and somewhat easier to use; there are times when one can live with
the restrictions of the synchronous model.

4.1.1 The Abstract Representation. 

The abstract representation is the interface to the generic theory. We specify the 
abstract representation by defining a list of abstract objects and operations. The 
operations are functions with domains consisting of both abstract and concrete
types. 


Not all of the members of the abstract representation are used in the definition of 
the interpreter. Some of them are only used to specify the theory obligations and
formulate the correctness statement. 

Before describing the abstract representation, we must emphasize that the repre- 
sentation is abstract and therefore, the objects and operations have no definitions. 
The descriptions that follow are what we intend for the representation to mean. 
The representation is purely syntactic, however; the names are simply convenient 
mnemonics. 

We begin by giving a description of the abstract types used in the representation. 
We know nothing of the structure or composition of an abstract type. 

• :*state represents the state and corresponds to S in the informal description
presented in Chapter 3.

• :*env represents the environment and corresponds to E in the informal de-
scription presented in Chapter 3.

• :*key is a type containing all of the keys and corresponds to K in the informal
description presented in Chapter 3.

In addition to these abstract types, the representation makes use of several concrete 
types: :time, :num, and :bool. The list and -> (function) type constructors are
used as well. We add primes to the types to indicate that they represent state, 
time, etc. at the implementing rather than the implemented level. 

As we mentioned earlier, there is a trade-off between the concreteness of the 
representation and the strength of the final result. We could make the specification 
of generic interpreters more abstract, but the result would likely be weaker. We 
could make it more concrete to strengthen the conclusions, but then we risk making 
it unusable. A good example of this trade-off is the representation of the instruction 
set. 

Our consideration for concreteness led us to discard a completely abstract object 
such as :*inst_set. As a practical issue, the theorems and tools for manipulating 
lists are codified much better in HOL than they are for sets. Since we will be 
proving results about the instruction set in order to instantiate the abstract theory, 
we chose lists over sets as the aggregation mechanism. 

The abstract function inst_list corresponds to J in the mathematical definition 
presented in Chapter 3. The instruction set, inst_list, is a collection of state 
transition functions and is denoted by a list of pairs. The first member of the pair 
is a key and the second member is a state transition function which operates on a 
state object and an environment object to produce a new state. 

The second member of the representation is the select function that picks a key 
based on the present state and environment. The select function corresponds to 
K from the definition in Chapter 3. 

The key returned by the select function is used to choose a member of the 
instruction list. The third member of the representation is used in indexing the 
instruction list. Since our representation for the instruction set is a list, key maps 
an object of type :*key to a number which is used with the HOL list indexing 
function EL to pick an instruction from the list. Together, EL and key correspond 
to the function C from the definition in Chapter 3. 

The model presented here is synchronous and therefore the proof requires that 
the number of cycles in the implementation be determinate for each instruction in 
inst_list. The function cycles returns, for each object of type :*key, the number 
of cycles used to implement the instruction associated with that key. 
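To make the roles of inst_list, key, select, and cycles concrete, here is a small Python sketch of the representation. The toy machine (an accumulator with INC and ADD instructions and their cycle counts) is invented for illustration; in the generic theory these members are abstract and have no definitions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = int          # toy state: an accumulator
Env = int            # toy environment: an input value
Key = str
Instruction = Callable[[State, Env], State]

@dataclass(frozen=True)
class Rep:
    inst_list: List[Tuple[Key, Instruction]]   # pairs of (key, state transition)
    key: Callable[[Key], int]                  # key -> index into inst_list
    select: Callable[[State, Env], Key]        # picks the key of the next instruction
    cycles: Callable[[Key], int]               # implementation cycles for each key

def step(rep: Rep, s: State, e: Env) -> State:
    """Apply the selected instruction: SND (EL (key (select s e)) inst_list) s e."""
    n = rep.key(rep.select(s, e))
    _, instruction = rep.inst_list[n]
    return instruction(s, e)

# Hypothetical instantiation: ADD when the input is non-zero, INC otherwise.
toy = Rep(
    inst_list=[("INC", lambda s, e: s + 1), ("ADD", lambda s, e: s + e)],
    key={"INC": 0, "ADD": 1}.__getitem__,
    select=lambda s, e: "ADD" if e else "INC",
    cycles={"INC": 2, "ADD": 3}.__getitem__,
)
```

Indexing the list with rep.key mirrors the use of EL in the theory; a set-based representation would need a lookup function with its own supporting theorems.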

The function substate, which corresponds to S in Chapter 3, is the state abstraction
for the interpreter. Notice that the domain of substate is primed, indicating
that it is from the implementing level. The function subenv, which corresponds to
ℰ in Chapter 3, is the environment abstraction function.

The definition in Chapter 3 did not treat the implementation. Because we want 
to prove correctness results about the interpreter, we must have something to verify 
it against. The final three functions in the abstract representation provide the 
necessary abstract definitions for the implementation. 

Impl is the abstract implementation. We could have chosen to make this function 
more concrete and define it as we do the interpreter (see Section 4.1.3), but doing 
so would require that every implementation be an interpreter or at least have some 
pre-chosen structure. As we will see in the example (Chapter 5), the implementation 
need not be modeled as an interpreter at all. Thus, we say nothing about it besides
defining its type. For now, its structure and operation are completely unknown.

The abstract function count is analogous to select except it operates at the 
implementing level. Notice that it uses the state and environment at the imple- 
menting level to produce a key for the implementing level. As we will see, this 
function is important in synchronizing the two levels. In the course of the verifica- 
tion we will ensure that the implementation periodically reaches the beginning of 
its cycle, denoted by the last member of the abstract representation, begin. 

We must emphasize once again that even though we have spent several paragraphs 
defining what each of the members of the abstract representation means, they are
truly abstract and have no meaning in the formal theory other than the relationships 
that will be defined in the theory obligations. 


4.1.2 The Theory Obligations. 

Theory obligations represent the semantics of the interface to the generic theory. In- 
side the theory, the only thing we know about the abstract representation presented 
in the last section is what the theory obligations say about it. 

What properties should the theory obligations have? 

• We would like the theory obligations to be sufficient to prove the correctness 
result. We make no claim that they are sufficient to prove any other property 
about our model. 

• We also would like, but do not require, that the theory obligations all be
necessary to prove the correctness result. An unnecessary obligation must be
satisfied over and over again for every instantiation, even though it follows
from the other obligations and definitions in the abstract theory. We ignore
obviously unnecessary obligations that are never used in proving the theorems
in the abstract theory.

We cannot prove that all of the obligations in the generic interpreter theory
are necessary, but there are only three of them and they seem reasonably
disjoint.


To prove the correctness result, we must know something about the implementa- 
tion. Since the implementation is a member of the abstract representation, nothing 
is known about it except the requirements set forth in the theory obligations. Prov- 
ing that the implementation implies the interpreter definition is typically done by 
case analysis on the instructions; we show that when the conditions for an in- 
struction’s selection are right, the instruction is implied by the implementation. In 
Section 2.1 we called this the instruction correctness lemma. 

The predicate INSTRUCTION_CORRECT expresses the conditions that we require in
the instruction correctness lemma.


    ⊢def INSTRUCTION_CORRECT rep s' e' inst =
       (Impl rep s' e') ==>
       (∀ t:time'.
         let s = (λ t. (substate rep (s' t))) in
         let e = (λ t. (subenv rep (e' t))) in
         let c = (cycles rep (select rep (s t) (e t))) in (
           (select rep (s t) (e t) = (FST inst)) ∧
           (count rep (s' t) (e' t) = (begin rep)) ==>
             ((SND inst) (s t) (e t) = (s (t + c))) ∧
             (count rep (s' (t + c)) (e' (t + c)) = (begin rep))))


INSTRUCTION_CORRECT is not really as complicated as it looks. The predicate
operates on a single instruction inst. The implementation implies that for all time,
if inst is selected and the implementation’s counter is at the beginning, then two 
things are true: 

1. Applying the instruction to the current state yields the same state change 
that the implementation does in c cycles and 

2. The counter in the implementation returns to the beginning of its cycle after c 
cycles. 

In all cases the number of low-level cycles it takes to implement one upper-level 
instruction must be determinate. The use of the abstract function cycles to deter- 
mine how long the instruction takes to run, c, enforces this condition. 
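The shape of the predicate can be checked on a finite trace with a short Python sketch. This is a bounded test rather than a proof: the trace is assumed to satisfy Impl, instants whose instruction would finish past the end of the trace are skipped, and the representation is passed as a dictionary whose field names follow the abstract representation. The two-cycle INC machine is invented for the example.

```python
def instruction_correct(rep, s_impl, e_impl, inst):
    """Bounded check of the instruction correctness condition for one instruction."""
    key_of_inst, transition = inst
    s = [rep["substate"](x) for x in s_impl]    # abstracted state stream
    e = [rep["subenv"](x) for x in e_impl]      # abstracted environment stream
    for t in range(len(s_impl)):
        k = rep["select"](s[t], e[t])
        c = rep["cycles"](k)
        at_begin = rep["count"](s_impl[t], e_impl[t]) == rep["begin"]
        if k == key_of_inst and at_begin and t + c < len(s_impl):
            if transition(s[t], e[t]) != s[t + c]:
                return False                     # wrong state after c cycles
            if rep["count"](s_impl[t + c], e_impl[t + c]) != rep["begin"]:
                return False                     # counter did not return to begin
    return True

# Hypothetical implementation state: (phase, accumulator); INC takes two cycles.
rep = {
    "substate": lambda x: x[1],
    "subenv": lambda x: x,
    "select": lambda s, e: "INC",
    "cycles": lambda k: 2,
    "count": lambda x, env: x[0],
    "begin": 0,
}
inc = ("INC", lambda s, e: s + 1)
good_trace = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2)]
bad_trace = [(0, 0), (1, 0), (0, 5)]
good = instruction_correct(rep, good_trace, [0] * 5, inc)
bad = instruction_correct(rep, bad_trace, [0] * 3, inc)
```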

Using INSTRUCTION_CORRECT we can define the theory obligations:

    new_theory_obligations
      [
       "EVERY (INSTRUCTION_CORRECT rep s' e') (inst_list rep)";
       "∀ k:*key. (key rep k) < (LENGTH (inst_list rep))";
       "∀ k:*key. k = (FST (EL (key rep k) (inst_list rep)))"
      ];;


The first obligation says that every instruction in the instruction list, inst_list, 
satisfies INSTRUCTION_CORRECT. The second obligation says that every key maps to
some location in the instruction list. The third obligation says that key actually 
maps a key to the instruction with which it is associated (i.e. that the list is ordered 
correctly). 
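For a finite key type, the second and third obligations amount to a simple check, sketched below in Python. The function name check_key_obligations and the sample instruction lists are hypothetical.

```python
def check_key_obligations(inst_list, key, all_keys):
    """Return (in_range, matches) for the second and third obligations."""
    # Obligation 2: every key maps to a position inside the instruction list.
    in_range = all(0 <= key(k) < len(inst_list) for k in all_keys)
    # Obligation 3: the instruction stored at that position carries the same key.
    matches = in_range and all(inst_list[key(k)][0] == k for k in all_keys)
    return in_range, matches

inst_list = [("INC", lambda s, e: s + 1), ("ADD", lambda s, e: s + e)]
keys = ["INC", "ADD"]

ordered = check_key_obligations(inst_list, {"INC": 0, "ADD": 1}.__getitem__, keys)
swapped = check_key_obligations(inst_list, {"INC": 1, "ADD": 0}.__getitem__, keys)
too_far = check_key_obligations(inst_list, {"INC": 0, "ADD": 2}.__getitem__, keys)
```

The swapped mapping stays in range but points each key at the wrong instruction, which is exactly the ordering mistake the third obligation rules out.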

As mentioned in Section 2.3, the obligations are used in two ways. First they 
are used axiomatically in proving the correctness result; we will do this in the next 
section. Second, they are the properties that users of the theory must prove about 
an instantiation. We will show this in Section 5.3. As we will see, the obligations
are really a small burden since the first obligation would have to be proven whether 
the generic theory was used or not and the other two are simple to prove for most 
instantiations. 


4.1.3 The Correctness Statement. 

Before proving the correctness statement, we must define the abstract interpreter. 
In addition to the state and environment, the interpreter is parameterized by the 
representation, rep. As discussed in Section 2.3.3, the objects in an abstract rep- 
resentation are really selection functions on a higher-order tuple in the logic. Thus 
the expression (key rep) in the interpreter definition selects the key function from 
the representation. When rep is instantiated, the selection function returns the 
concrete function for key. 


    ⊢def INTERP rep (s:time -> *state) (e:time -> *env) =
       ∀ t:time.
         let n = (key rep (select rep (s t) (e t))) in (
           s(t+1) = (SND (EL n (inst_list rep))) (s t) (e t))


The interpreter definition corresponds to I[s, e] in the definition in Chapter 3. 
The interpreter relates the state at time t + 1 to the state and environment at 
time t through an instruction selected from the instruction list. The instruction 
is indexed in the list using the number returned from applying key to the result
of select and HOL’s list indexing function EL. The state transition function is
the second element in the resulting pair (selected using SND).
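Read as a predicate on streams, the interpreter definition can be tested on finite prefixes. The following Python sketch does so; the function name interp and the toy INC/ADD instruction list are assumptions made for the example, and a finite check can only refute the predicate, never establish it for all time.

```python
def interp(inst_list, key, select, s, e):
    """Check s(t+1) = SND (EL (key (select (s t) (e t))) inst_list) (s t) (e t)
    for every t at which both s(t) and s(t+1) are available."""
    return all(
        s[t + 1] == inst_list[key(select(s[t], e[t]))][1](s[t], e[t])
        for t in range(len(s) - 1)
    )

# Hypothetical toy instruction set: INC when the input is zero, ADD otherwise.
inst_list = [("INC", lambda s, e: s + 1), ("ADD", lambda s, e: s + e)]
key = {"INC": 0, "ADD": 1}.__getitem__
select = lambda s, e: "ADD" if e else "INC"

valid = interp(inst_list, key, select, [0, 1, 5, 6], [0, 4, 0, 0])
invalid = interp(inst_list, key, select, [0, 2], [0, 0])
```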

We wish to show that when the theory obligations are met, the implementa- 
tion implies the interpreter definition. In order to prove the correctness result, we 
will need to define the temporal abstraction between the implementation and the 
interpreter. 

When time appears in the theory obligations, it is time at the implementation 
level. Before we can relate the interpreter and its implementation, we must relate 
the different time granularities at the two levels. The relationship between the two 
representations of time can be expressed in a recursive function. 


    (time_shift g s e 0 = 0) ∧
    (time_shift g s e (SUC n) = (
       let t = (time_shift g s e n) in
       t + (g (s t) (e t))))


When applied to time at the interpreter level, time_shift returns time at the
implementation level. It corresponds to the function T in Figure 3.5. The function g
takes a state value and an environment value and returns the number of cycles for 
the instruction that is to be executed. We implement g using select and cycles. 
The function recurses to determine how many implementation cycles were required 
to reach the current instruction. 
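A direct transcription of the recursion into Python may help; it is written as a loop, which computes the same values as the primitive recursive definition. The streams s and e and the cycle function g used in the example are made up.

```python
def time_shift(g, s, e, n):
    """Implementation time at which interpreter step n begins.

    g(state, env) is the number of implementation cycles taken by the
    instruction selected in that state and environment; s and e map
    implementation time to state and environment values."""
    t = 0                        # time_shift g s e 0 = 0
    for _ in range(n):           # one unfolding of the SUC case per step
        t = t + g(s(t), e(t))
    return t

# Hypothetical streams: the state at time t is t itself, the environment is 0.
s = lambda t: t
e = lambda t: 0
g = lambda state, env: 2 if state % 4 == 0 else 3

shifted = [time_shift(g, s, e, n) for n in range(5)]
```

Each interpreter step advances implementation time by the cycle count of the instruction that starts at the previous boundary, so the result is always a running sum of cycle counts.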

The instruction correctness lemma contains a termination assumption that says 
that the implementation clock always returns to the beginning of its cycle at every 
interpreter clock tick. This assumption is too messy to appear in the final result 
since it seems difficult to discharge. Actually, we can show that a much simpler 
assumption implies the more complicated one. This is known as the clock lemma. 

The clock lemma shows that if count, the implementation level clock, is at the 
beginning of its cycle at time 0, then it will be at the beginning of its cycle for every 
clock tick at the interpreter level. 


    CLOCK_LEMMA =
    ⊢ (Impl rep) s' e' ∧
      ((count rep) (s' 0) (e' 0) = (begin rep)) ==>
      let s = (λ t:time. (substate rep (s' t))) and
          e = (λ t:time. (subenv rep (e' t))) in (
        ∀ t. let t_impl =
               (time_shift
                 (λ st env.
                   (cycles rep (select rep st env))) s e t) in
             (count rep) (s' t_impl) (e' t_impl) = (begin rep))


We can use a reset button in the implementation to force the clock to the beginning
of its cycle at time 0.

Using the clock lemma, the theory obligations and several intermediate lemmas, 
we can prove the correctness result for our generic, synchronous interpreter. 


    INTERP_CORRECT =
    ⊢ let s = (λ t:time. (substate rep (s' t))) and
          e = (λ t:time. (subenv rep (e' t))) in (
        (Impl rep) s' e' ∧
        ((count rep) (s' 0) (e' 0) = (begin rep)) ==>
        let f = time_shift
                  (λ st env. (cycles rep (select rep st env))) s e in
        (INTERP rep) (s o f) (e o f))


The state and environment variables in the correctness theorem, s and e, are func- 
tions of time at the implementation level. Of course, to use them as arguments to 
the interpreter definition, INTERP, we need to temporally abstract implementation 
time to interpreter time. Using time_shift, we can modify the state and environment
streams, producing streams appropriate for the interpreter. The expression (s o f)
represents the interpreter level state stream whereas s is the implementation
level state stream. 

The correctness theorem states that the implementation implies the definition of 
the interpreter as long as the implementation clock starts off at the beginning of its 
cycle. Of course, the result is also predicated on the theory obligations. They are 
not visible in the theorem, but they must be discharged before it can be used. 
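The statement can be exercised end to end on a toy machine. The Python sketch below simulates a hypothetical implementation in which INC takes two cycles and ADD takes three, samples its state through time_shift, and checks both the interpreter relation and the conclusion of the clock lemma on a finite prefix. Everything about the machine is an assumption made for illustration, and a simulation of this kind is evidence, not a proof.

```python
def time_shift(g, s, e, n):
    """Implementation time at which interpreter step n begins."""
    t = 0
    for _ in range(n):
        t = t + g(s(t), e(t))
    return t

def impl_step(state, env):
    """Toy implementation; state is (countdown, accumulator, pending result)."""
    countdown, acc, result = state
    if countdown == 0:                    # beginning of an instruction
        cycles = 3 if env else 2          # ADD takes 3 cycles, INC takes 2
        result = acc + env if env else acc + 1
        return (cycles - 1, acc, result)
    if countdown == 1:                    # last cycle: commit the result
        return (0, result, result)
    return (countdown - 1, acc, result)

env_stream = [0, 0, 4, 9, 9, 0, 0, 7, 7, 7, 0, 0]
trace = [(0, 0, 0)]                       # reset: the counter starts at begin
for t in range(len(env_stream) - 1):
    trace.append(impl_step(trace[t], env_stream[t]))

def s(t):                                 # substate: keep only the accumulator
    return trace[t][1]

def e(t):                                 # subenv: the identity abstraction
    return env_stream[t]

def cycles(state, env):                   # cycles composed with select
    return 3 if env else 2

def instruction(state, env):              # the selected state transition
    return state + env if env else state + 1

f = [time_shift(cycles, s, e, n) for n in range(5)]
abstract_states = [s(t) for t in f]
interp_holds = all(
    s(f[n + 1]) == instruction(s(f[n]), e(f[n])) for n in range(len(f) - 1)
)
clock_at_begin = all(trace[t][0] == 0 for t in f)
```

Environment values that arrive in the middle of an instruction (the 9s in env_stream) are never sampled, which is the temporal abstraction at work.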


4.2 Asynchronous Interpreters 


The previous section presented a formal model of generic interpreters where the
synchronization function, time_shift, operated deterministically. The determin- 
ism was provided by the abstract function cycles which returns the number of 
implementation cycles for each member of the instruction set. Often, such deter- 
ministic synchrony is not desirable, or even possible. 

• The number of implementation cycles may depend not only on the instruction, 
but on the arguments to the instruction as well. A multiply instruction is one 
example. 

• The number of implementation cycles may depend on some external device or 
signal. Examples of this include asynchronous memory, an interrupt, or user 
input. 

Since instructions with non-deterministic synchrony in their implementations occur
so frequently in computer systems, our model would not be very useful if it
excluded them. This section presents a modification of the synchronous model
presented in the last section. The new model removes the restriction that the number
of implementation cycles for each instruction be deterministic, while maintaining
the strong correctness result. The section starts off with a discussion of a more
general view of temporal abstraction and then presents the modified theory.

Figure 4.1: The function F, which maps time at one level to another, can
be defined in terms of a predicate, Q, which is true only when
the mapping occurs.


4.2.1 Temporal Abstraction 

Section 3.3.4 presented an informal look at stream abstraction. As discussed in that 
section, a major component of abstraction over streams was temporal abstraction. 
The function time_shift, which appeared in Section 4.1.3, was an attempt to 
relate the different views of time at the implementing and implemented levels in 
the synchronous interpreter. The function was simple in concept and execution, 
but is too restrictive. This section presents the development of a formal theory for
temporal abstraction. The development follows that of [Joy89a,Mel88,Her88]. The 
ML code creating this theory is contained in [Win90b] . 

Figure 4.1 is the same as Figure 3.5 except for the representation of the predicate,
Q. This predicate is true whenever there is a valid abstraction from the lower
level to the upper level. We can define a generic temporal abstraction function
in terms of Q. It may seem that we have given up having to define cycles only
to be burdened by having to define Q, but as we will see, defining Q is much less
restrictive than defining cycles. In a microprocessor specification, Q is usually
a predicate indicating when the lower level interpreter is at the beginning of its
cycle, a condition that is easy to test.

To begin, we can define First and Next, two predicates that use Q to express
two very important concepts. The predicate First is true when t is the first time
that g is true.


    ⊢def First g t =
       (∀ p:time. p < t ==> ¬(g p)) ∧
       (g t)


The predicate Next is true when t2 is the next time after t1 that g is true.

    ⊢def Next g (t1,t2) =
       (t1 < t2) ∧
       (∀ t:time. t1 < t ∧ t < t2 ==> ¬(g t)) ∧
       (g t2)
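Because both predicates quantify only over times below a known bound, they can be evaluated directly. A minimal Python sketch follows; the trailing underscore in next_ avoids clashing with Python's built-in next, and the sample predicate g is invented.

```python
def first(g, t):
    """True when t is the first time at which g holds."""
    return all(not g(p) for p in range(t)) and g(t)

def next_(g, t1, t2):
    """True when t2 is the next time after t1 at which g holds."""
    return t1 < t2 and all(not g(t) for t in range(t1 + 1, t2)) and g(t2)

def g(t):
    """Hypothetical predicate: true at a handful of implementation times."""
    return t in {1, 5, 7, 8, 10}
```

With this g, first holds only at time 1, and next_ holds exactly for the adjacent pairs (1,5), (5,7), (7,8), and (8,10).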



We would like to define T (see Figure 4.1) using First and Next. Clearly, at time
t'_1, First g t'_1 is true. In addition, Next g (t'_1,t'_5), Next g (t'_5,t'_7),
Next g (t'_7,t'_8), and Next g (t'_8,t'_10) are true as well. How can we use First
and Next, both predicates, to return the proper values?

The axiomatization of HOL uses Hilbert’s choice operator, ε. Given some predicate
P, ε x. P(x) represents a value satisfying P. For example,


    ε x:num. x * x = 25


denotes 5 (but not -5, as the type num only contains the natural numbers). So, using
the choice operator, we can define T as follows (T has been renamed to Temp_Abs
which is more mnemonic):


    ⊢def (Temp_Abs g 0 = ε t:time. First g t) ∧
         (Temp_Abs g (SUC n) = ε t:time. Next g (Temp_Abs g n, t))


So, Temp_Abs at time 0 is the first time g is true and Temp_Abs at time n + 1 is the
next time after Temp_Abs at time n when g is true.
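A computational reading of this definition is sketched below in Python. The helper choose stands in for the choice operator by searching upward from zero; this is only an approximation, since the HOL operator denotes some unspecified value when no witness exists, whereas the sketch raises an error once a (hypothetical) search horizon is exhausted. For First and Next the witness is unique when it exists, so the search order does not matter.

```python
def first(g, t):
    return all(not g(p) for p in range(t)) and g(t)

def next_(g, t1, t2):
    return t1 < t2 and all(not g(t) for t in range(t1 + 1, t2)) and g(t2)

def choose(pred, horizon=1000):
    """Stand-in for the choice operator over num: return a witness of pred."""
    for x in range(horizon):
        if pred(x):
            return x
    raise ValueError("no witness below the search horizon")

def temp_abs(g, n):
    """Implementation time of the n-th occurrence of g (counting from 0)."""
    t = choose(lambda x: first(g, x))
    for _ in range(n):
        t = choose(lambda x, t1=t: next_(g, t1, x))
    return t

def g(t):
    """Hypothetical predicate: true at a handful of implementation times."""
    return t in {1, 5, 7, 8, 10}

five = choose(lambda x: x * x == 25)
times = [temp_abs(g, n) for n in range(5)]
```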

The only problem with this definition is that Hilbert’s operator is difficult to use
in proofs since the methods for handling it in HOL are relatively weak. Fortunately,
it is possible to prove theorems about Temp_Abs that make it simple to reason about
its behavior. Several of these are found in the temporal abstraction theory
in [Win90b]. One of the most important is the following theorem, which says that
if g is true infinitely often and a relation, r, holds between each time that g is
true and the next such time, then the same relation holds between successive times
returned by Temp_Abs.

    INF_Temp_Abs =
    ⊢ ∀ g r.
        (∃ t:time. g t) ∧
        (∀ t:time. g t ==> ∃ n. Next g (t,t+n) ∧ r(t,t+n)) ==>
        ∀ u. r(Temp_Abs g u, Temp_Abs g (u+1))


Another useful theorem describes what happens when g is always true. 


    Temp_Abs_DEGENERATE =
    ⊢ Temp_Abs (λ t:time. T) = I


This is a degenerate case and as intuition would suggest, Temp_Abs simply reduces 
to the identity function, I. It might not be clear why this last theorem is of use. 
In defining the generic theory, we will assume that a temporal abstraction always 
exists between levels in a specification. Such is not the case; sometimes, there 
is no temporal abstraction. Rather than dealing with this as a special case, it 
is convenient to use the general theory with a degenerate temporal abstraction 
function. 
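The degenerate case is easy to observe with an executable stand-in for the temporal abstraction function. The sketch below assumes a bounded search horizon and simply enumerates the times at which the predicate holds, which agrees with the First/Next characterization whenever enough such times exist below the horizon; the name temp_abs and the horizon are choices made for the example.

```python
def temp_abs(g, n, horizon=1000):
    """Time of the n-th occurrence of g (counting from 0) below horizon."""
    times = [t for t in range(horizon) if g(t)]
    return times[n]

# With a predicate that is always true, every instant is an abstraction point,
# so the abstraction maps each upper-level time to itself.
degenerate = [temp_abs(lambda t: True, n) for n in range(20)]

# For contrast, a predicate that is true every third instant.
every_third = [temp_abs(lambda t: t % 3 == 0, n) for n in range(5)]
```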


4.2.2 The Abstract Representation 


The abstract representation for the asynchronous model is identical to the repre- 
sentation for the synchronous model except that the abstract function cycles has 
been eliminated. 


    let cpu_abs = new_abstract_representation
      [
       (`inst_list`,":(*key#(*state->*env->*state))list");
       (`key`,":*key->num");
       (`select`,":*state->*env->*key");
       (`substate`,":*state'->*state");
       (`subenv`,":*env'->*env");
       (`Impl`,":(time'->*state')->(time'->*env')->bool");
       (`count`,":*state'->*env'->*key'");
       (`begin`,":*key'")
      ];;

The meanings of the abstract functions in the representation are identical to the
meanings of the functions in the abstract representation for the synchronous model.
Of course, they are only placeholders in the definitions that follow.


4.2.3 The Theory Obligations 


The major change in the theory obligations for the asynchronous model involves 
the instruction correctness predicate. The instruction correctness predicate for the 
synchronous model was able to calculate the number of cycles required to implement 
an instruction using the abstract operation cycles. The length of an instruction
cycle in the asynchronous model is indeterminate, but finite. In fact, that is all we
need to say about it to prove the correctness statement. We will say that there
exists a time in the future when the current cycle will be over. The currently
selected instruction applied to the current state should yield the same value as the
state at the beginning of the next cycle.


    ⊢def INST_CORRECT rep s' e' inst =
       (Impl rep s' e') ==>
       (∀ t:time'.
         let s = (λ t. (substate rep (s' t))) in
         let e = (λ t. (subenv rep (e' t))) in
         let g = (λ t. (count rep (s' t) (e' t) = (begin rep))) in (
           (select rep (s t) (e t) = (FST inst)) ∧
           (count rep (s' t) (e' t) = (begin rep)) ==>
             ∃ c. Next g (t,t+c) ∧
                  ((SND inst) (s t) (e t) = (s (t + c)))))


As before, s and e are the abstracted state and environment. We define g, the
predicate that is true when the cycle is over, by testing if count is equal to begin.
The predicate Next uses g to constrain the existentially quantified variable, c, to 
the time when the cycle ends. 
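A bounded Python check of this predicate is sketched below. It assumes the trace satisfies Impl, treats a cycle that runs past the end of the finite trace as inconclusive, and uses an invented machine whose ADD instruction takes as many cycles as its operand (one cycle when the operand is zero). Since Next determines the end of the cycle uniquely, the existential over c reduces to looking up that one time.

```python
def next_(g, t1, t2):
    return t1 < t2 and all(not g(t) for t in range(t1 + 1, t2)) and g(t2)

def inst_correct(rep, s_impl, e_impl, inst):
    """Bounded check of the asynchronous instruction correctness condition."""
    key_of_inst, transition = inst
    horizon = len(s_impl)
    s = [rep["substate"](x) for x in s_impl]
    e = [rep["subenv"](x) for x in e_impl]

    def g(t):                             # true when the counter is at begin
        return rep["count"](s_impl[t], e_impl[t]) == rep["begin"]

    for t in range(horizon):
        if rep["select"](s[t], e[t]) == key_of_inst and g(t):
            ends = [t2 for t2 in range(t + 1, horizon) if next_(g, t, t2)]
            if not ends:
                continue                  # cycle not finished within the trace
            if transition(s[t], e[t]) != s[ends[0]]:
                return False
    return True

# Hypothetical implementation state: (remaining cycles, accumulator).
rep = {
    "substate": lambda x: x[1],
    "subenv": lambda x: x,
    "select": lambda s, e: "ADD",
    "count": lambda x, env: x[0],
    "begin": 0,
}
add = ("ADD", lambda s, e: s + e)
env = [3, 0, 0, 0, 2, 0, 0]
good_trace = [(0, 0), (2, 1), (1, 2), (0, 3), (0, 3), (1, 4), (0, 5)]
bad_trace = good_trace[:-1] + [(0, 6)]
good = inst_correct(rep, good_trace, env, add)
bad = inst_correct(rep, bad_trace, env, add)
```

Note that no cycles function appears anywhere: the length of each instruction is discovered from the trace rather than declared in advance.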

Once the instruction correctness predicate has been defined, the theory obliga- 
tions for the asynchronous model are identical to the theory obligations for the 
synchronous model. 


    new_theory_obligations
      [
       "EVERY (INST_CORRECT rep s' e') (inst_list rep)";
       "∀ k:*key. (key rep k) < (LENGTH (inst_list rep))";
       "∀ k:*key. k = (FST (EL (key rep k) (inst_list rep)))"
      ];;

Due to the changes in the instruction correctness predicate, the theory obligations
for the asynchronous model are less restrictive than the theory obligations in the
synchronous model; nevertheless, they are sufficient for proving the correctness
result. As we will see in Chapter 5, however, satisfying the less restrictive obligations
can be more difficult than satisfying the obligations for the synchronous model; this
can make instantiating the generic theory more difficult.


4.2.4 The Correctness Statement 

Just as in the synchronous model, we must define the interpreter before we prove 
a correctness statement about it. The definition for the interpreter is the same in 
both models.


    ⊢def INTERP rep s e =
       ∀ t:time.
         let n = (key rep (select rep (s t) (e t))) in (
           s(t+1) = (SND (EL n (inst_list rep))) (s t) (e t))


The specification of an interpreter is a predicate relating the contents of the state 
stream at time t + 1 to the contents of the state stream at time t. The relationship 
is defined using the functions from the abstract representation in the same manner 
as before. 

An important step in proving the correctness result is showing that the implemen- 
tation implies that the next state follows from the currently selected instruction. Of 
course, the theory obligations play an important part in proving this lemma, which 
is called the next-state lemma. 


    IMPL_NEXTSTATE_LEMMA =
    ⊢ let s = (λ t:time. (substate rep (s' t))) and
          e = (λ t:time. (subenv rep (e' t))) and
          f = (λ t. (count rep (s' t) (e' t) = (begin rep))) in (
        (Impl rep s' e') ==>
        (∀ t:time'.
          (count rep (s' t) (e' t) = (begin rep)) ==>
          ∃ c.
            Next f (t,t+c) ∧
            ((substate rep (s' (t + c))) =
             (SND (EL (key rep (select rep (s t) (e t)))
                      (inst_list rep))) (s t) (e t))))


The implementation-level clock is assumed to start at the beginning of its cycle
and the Next predicate is used to constrain the clock so that it terminates, ready to
start the cycle again.

We use the next-state lemma to prove the correctness result. There is no need
to prove a clock lemma in the asynchronous model. The clock lemma in the
synchronous model proved that the temporal abstraction function, time_shift,
behaved correctly. In the asynchronous model the temporal abstraction function is
correct by definition. The use of the choice operator says that the value will be
correct provided one exists.


    IMPL_I_CORRECT =
    ⊢ let s = (λ t:time. (substate rep (s' t))) and
          e = (λ t:time. (subenv rep (e' t))) and
          f = (λ t:time. (count rep (s' t) (e' t) = (begin rep))) in
      let abs = (Temp_Abs f) in (
        (Impl rep s' e') ∧
        (∃ t. f t) ==>
        (INTERP rep) (s o abs) (e o abs))


In the correctness statement, s’ and e’ are the state and environment streams 
in the implementation. The terms (s o abs) and (e o abs) are the state and 
environment streams for the interpreter defined in the theory. They are data and 
temporal abstractions of s’ and e’. The correctness statement says that if the 
implementation is valid on its state and environment streams and there is a time 
when the implementing clock is at the beginning of its cycle, then the interpreter is 
valid on its state and environment streams. 
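As with the synchronous model, the statement can be exercised on a toy machine. In the Python sketch below, a hypothetical implementation adds its operand one unit per cycle, so the length of each instruction depends on the data. The abstraction abs is computed by enumerating the times at which the counter is at the beginning of its cycle, which matches Temp_Abs whenever those times exist. All names and the machine itself are assumptions made for illustration.

```python
def impl_step(state, env):
    """Toy implementation; state is (remaining cycles, accumulator)."""
    remaining, acc = state
    if remaining == 0:                 # beginning of a cycle: latch the operand
        if env == 0:
            return (0, acc)            # adding zero finishes in one cycle
        return (env - 1, acc + 1)
    return (remaining - 1, acc + 1)    # keep adding one unit per cycle

env_stream = [3, 9, 9, 0, 2, 9, 1, 0, 0]
trace = [(0, 0)]
for t in range(len(env_stream) - 1):
    trace.append(impl_step(trace[t], env_stream[t]))

def f(t):                              # the counter is at the beginning of its cycle
    return trace[t][0] == 0

# Times picked out by the temporal abstraction, in increasing order.
abs_times = [t for t in range(len(trace)) if f(t)]

s_abs = [trace[t][1] for t in abs_times]     # (s o abs): substate keeps the accumulator
e_abs = [env_stream[t] for t in abs_times]   # (e o abs): subenv is the identity

# INTERP with a single ADD instruction: s(n+1) = s(n) + e(n).
interp_holds = all(
    s_abs[n + 1] == s_abs[n] + e_abs[n] for n in range(len(s_abs) - 1)
)
```

The operand values that arrive while an addition is in progress (the 9s) never reach the interpreter level, and no cycle count had to be supplied in advance.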


4.3 Conclusions 


We have now proven a correctness statement for two different interpreter models. 
These models each define a class of computational objects. The correctness results 
provide a verification of every microprocessor matching the loose semantics defined 
in the models. 

The most important benefit of the generic models is that they structure the proof. 
A generic model states explicitly which definitions must be made (one for each of the 
members of the abstract representation) and which lemmas need to be proven about 
these definitions (namely, the three theory obligations). This is a large improvement 
over previous microprocessor verifications where these decisions were made on an 
ad hoc basis. 

For each model, the correctness theorem, definitions, and abstractions that make
up the theory are important for several reasons.

1. The models show exactly what is required to verify that an interpreter is cor-
rect. There is no superfluous detail cluttering up the definitions and theorems.

2. The generic proof is easier than the specific proof. This point is subjective,
but having completed three different specific interpreter proofs, we found that
proving the correctness result for the generic interpreter was easier than sim-
ilar theorems for specific interpreters. In proving theorems about specific
interpreters, there is always some amount of detail that is necessary for the
specific interpreter, but not meaningful in the correctness result. Even so, this
detail must be manipulated to complete the proof.

3. Temporal abstraction issues are handled completely within the generic theory. 
This frees the user of the theory from proving theorems about the temporal 
abstraction; it is only done once, when the theory is built. 

4. Similarly, data abstraction between the state and environment streams at 
the two levels in the theory is clearly defined and consistently performed. 
The user’s contribution is to define the abstractions; the theory uses the
abstractions to effect the proof.

5. The generic proof can be instantiated, allowing the theorems to be reused and 
saving the verifier from having to reverify these theorems. 


The use of a generic interpreter theory for specifying and verifying micropro- 
cessors provides a methodological approach. Making specification and verification 
methodological is an important step in turning what has primarily been a research 
activity into an engineering activity. We believe that the most important contribu- 
tion of this work may be the organization that the generic theory provides. 


Chapter 5


A Verified Microprocessor 


We have designed a computer designated AVM-1 (A Verified Microprocessor) to
demonstrate the use of generic interpreters in verifying hierarchically decomposed 
microprocessor specifications. There are several reasons why we chose to design our 
own microprocessor rather than using an existing one: 

• In order to verify a commercial microprocessor, we would have to have access 
to the design, which is likely to be proprietary. Further, the design would 
have to be correct, which is unlikely. 

• A formal specification for a commercial microprocessor is unlikely to be avail- 
able. 

• Any specification written for a commercial microprocessor would be “after the 
fact” and therefore suspect. 

• There are architectural and organizational features that can ease the burden 
of verification. An existing microprocessor might not have these features. 
Among these are regular instruction formats and microcoding. We will explain 
why these features reduce the verification effort. 

Our design is an attempt to build a microprocessor that is at once verifiable, 
implementable, and usable. We have been influenced by our own experience in verifying
microprocessors [Win90a], the experience of others [Joy89a, Coh88a], and our
desire to provide hardware features in support of operating systems; such features 
include interrupts, memory management, and supervisory modes. AVM-1 is part of 
a verified chip set being designed and verified by the Computer Systems Verification 
Group at the University of California, Davis. Other pieces of the system include 
a memory management unit, a floating point unit, an interrupt controller, and a 
direct memory access chip. 

Counter to the current trend in microprocessor design, we have chosen to imple- 
ment AVM-1 using microcode. We believe that microcoding a verified design can 
reduce the amount of effort required to verify the implementation. As we mentioned 
in Section 3.1, we can hierarchically decompose the specification in order to limit the 
number of difficult cases in the proof. Recall that the difficulty is caused by the size 
of the electronic block model description and the fact that it is a structural, rather 
than a behavioral specification. In verifying a hierarchically decomposed specification,
these difficult cases occur when verifying the phase-level with respect to the
electronic block model. If the microprocessor is not microcoded, the phase-level 
description becomes much more complicated and the difficulty of the phase-level 
proof is exacerbated. This is not to say that hardwired designs cannot be verified, 
just that they are more difficult. 

Another reason for using microcode in the design of a verified microprocessor 
is the opportunity it affords for easily reverifying the microprocessor when minor 
changes to the design are made. As we will see, the most difficult part of a microprocessor
verification is proving the correspondence between the electronic block model
and the phase-level. The phase-level description can be parameterized over the mi- 
crorom so that the microrom, and consequently the microprocessor’s behavior, can 
be changed without having to redo the difficult phase-level verification. Once a 
verified phase-level interpreter exists, establishing a proof for a new macro-level can 
be accomplished with little additional effort. 

• Because of the regularity of the proofs for the macro-level and micro-level, 
general purpose tactics can be devised to verify these levels. 

• The proof can be completed by defining the microcode, reverifying the new 
design using the tactics mentioned above, and instantiating the generic inter- 
preter theory to generate the proof. 


Thus a microprocessor design can be customized at very little additional cost after 
the initial micro-engine has been verified and tactics for verifying the higher levels 
in the hierarchy have been developed. 

This chapter presents a detailed example of how the generic interpreter theory 
can be used to verify a microprocessor. We begin with a discussion of the archi- 
tecture and organization of AVM-1. The second section of the chapter formally 
specifies each of the levels in the hierarchical decomposition of AVM-1. The last 
section describes the development of a correctness proof for AVM-1 using the formal 
specifications and the HOL verification environment. 


5.1 AVM-1’s Architecture and Organization.


We distinguish between a computer’s architecture and its organization. The former 
is behavioral in nature and the latter structural. Our goal in this chapter is to 
show that a particular organization correctly implements our desired architecture. 
This section will give a brief, natural language description of the architecture and 
organization of AVM-1. Later in this chapter, we will present a formal specification 
of both using higher-order logic. 


5.1.1 An Architectural View.


A computer’s architecture is its programming interface; an architecture describes a 
language and how that language is interpreted. The language definition contains a 
specification of the computer’s state and the instructions available for manipulating 
that state. The architecture must also define how instructions are selected.

Specifying an architecture amounts to defining a language. This definition can
be given in a natural language or in a more formal language, but in either case it
primarily tells the programmer how the machine interprets instructions. This section uses
a combination of natural language and a less ambiguous register transfer language 
(RTL) to describe AVM-1. The description is similar to what one would find in a 
programmer’s manual for a commercial microprocessor. 

The instruction set was inspired by the RISC I instruction set found in Katevenis
[Kat85]. There are a number of differences, but many features in the RISC I
instruction set (such as using ALU operations to synthesize a MOVE instruction) 
were incorporated into the AVM-1 instruction set. As we will see in the section 
on organization, however, AVM-1 cannot be called a RISC architecture since its
microcoded implementation differs from that of today’s RISC chips.

One caveat: AVM-1 was not designed to be a showcase for architecture, but 
rather to show that microprocessors with modern features such as privileged modes 
and interrupts could be verified. While one may quibble with the design of AVM-1, 
this in no way affects the usefulness of the example. 


5.1.1.1 RTL Notation.

We will use a register transfer language to describe the semantics of the instruction 
set. There are many register transfer languages in use; the notation and symbols
for the RTL used in this dissertation are found in Table 5.1. In general, any capital
letter refers to a register. We will define the symbols standing for certain registers
later, as the registers are described. Memory is designated by M. Most of the other
symbols are self-explanatory. The keyword status returns the status of the last
ALU operation, that is, the carry, overflow, negative, and zero flags.


5.1.1.2 The Registers.

AVM-1 has a load-store architecture based on a large register file. The register file
(denoted R in our RTL) is divided into three portions: 

1. Register 0, which is read-only and contains the constant 0.

Table 5.1: Symbols in the Register Transfer Language.

Symbol           Meaning                          Example
---------------  -------------------------------  ---------------------------
letters          a register                       PSW, PC
subscripts       one or more bits in a register   PSW₄
R[X]             register X of the register file  R[10]
()               field in a register              PSW(1-4)
M[X]             location X in memory             M[PC]
<=               transfer of information          PC <= R[3]
p → Op₁ | Op₂    if p then Op₁ else Op₂           PSW₅ → (B <= C) | (B <= D)
,                separates parallel operations    B <= C, D <= E
+                add                              B <= C + D
-                subtract                         B <= C - D
∨                logical-OR                       B <= C ∨ D
∧                logical-AND                      B <= C ∧ D
⊕                logical-exclusive-OR             B <= C ⊕ D
¬                logical-complement               B <= ¬ C
shl              logical shift left               B <= shl C
shr              logical shift right              B <= shr C
asr              arithmetic shift right           B <= asr C
msb              most significant bit             PSW₁ <= msb C
lsb              least significant bit            PSW₁ <= lsb C
status           status of last ALU operation     PSW(0-3) <= status


Table 5.2: The program status word.

Bit  Meaning when set
---  ------------------------------------
0    Last ALU result was zero
1    Last ALU operation caused a carry
2    Last ALU result was negative
3    Last ALU operation caused an overflow
4    Interrupts enabled
5    In supervisory mode


2. Seven supervisor-mode registers, including a distinguished register for use as
the supervisor stack pointer (denoted SSP). The supervisor-mode registers
are read-only unless the CPU is in supervisor mode (determined by bit 5
of the program status word; see Table 5.2).

3. Twenty-four general purpose registers. 

Two additional registers are visible at the architectural level: the program counter 
and the program status word. The program counter (denoted PC) is used to sequence 
the computer — it indicates which instruction in memory to execute next. 

The program status word (denoted PSW) is used to keep track of the status of the 
last ALU operation, whether or not interrupts are enabled, and the privilege level 
of the CPU. Table 5.2 shows the meaning of the 6 bits in the program status word. 

AVM-1 shares a register, IVEC, with the interrupt controller. This register con- 
tains the interrupt vector and is read-only as far as the CPU is concerned. 
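
The register conventions above can be sketched in Python as follows. This is only an illustration: the text does not say which register numbers are the seven supervisor-mode registers (the sketch assumes R[1] through R[7]), it does not say what happens on a write to a read-only register (the sketch silently ignores it), and the 32-bit register width and the names PSW, RegisterFile, and SUPERVISOR_REGS are assumptions.

```python
from dataclasses import dataclass, field
from enum import IntFlag


class PSW(IntFlag):
    """Program status word bits, following Table 5.2."""
    ZERO = 1 << 0
    CARRY = 1 << 1
    NEGATIVE = 1 << 2
    OVERFLOW = 1 << 3
    INT_ENABLE = 1 << 4
    SUPERVISOR = 1 << 5


# Assumption: the supervisor-mode registers are R[1]..R[7].
SUPERVISOR_REGS = range(1, 8)


@dataclass
class RegisterFile:
    regs: list = field(default_factory=lambda: [0] * 32)

    def read(self, i: int) -> int:
        # Register 0 always reads as the constant 0.
        return 0 if i == 0 else self.regs[i]

    def write(self, i: int, value: int, psw: PSW) -> None:
        if i == 0:
            return  # register 0 is read-only
        if i in SUPERVISOR_REGS and not (psw & PSW.SUPERVISOR):
            return  # supervisor registers are read-only outside supervisor mode
        self.regs[i] = value & 0xFFFFFFFF  # assumed 32-bit registers
```

One read-only register, seven supervisor-mode registers, and twenty-four general purpose registers account for all 32 entries of the register file.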


5.1.1.3 The Instruction Set.

The instruction set contains 30 instructions. The opcode space has room for 64; the 
upper half of the opcode space is reserved for future co-processors. As mentioned 
above, the instruction set is based on a load-store architecture, meaning that most
instructions are not allowed to access memory for their operands. 


The Instruction Format. The instruction formats are simple and regular. Fig- 
ure 5.1 shows the four instruction formats. All of the formats use the same opcode 
field. 

In formats 1 and 2, the instruction is divided into four fields. The top 6 bits 
(31-26) give the opcode of the instructions. The next 5 bits (25-21) denote the 
destination register in most operations. The third field (bits 20-16) selects the 
register used as the A operand in most operations. In format 1, the fourth field is 
comprised of bits 15-11 and is used to select the register used as the B operand. 

Format  Bits 31-26  Bits 25-21  Bits 20-16  Bits 15-11  Bits 10-0
------  ----------  ----------  ----------  ----------  ---------
1       opcode      dest        A           B           unused
2       opcode      dest        A           immediate (bits 15-0)
3       opcode      dest        unused (bits 20-0)
4       opcode      unused (bits 25-0)

Figure 5.1: The instruction formats in AVM-1.

In format 2, the fourth field uses all of the 16 remaining bits to form an immediate
number (0 to 2^16 - 1).

Format 3 is identical to formats 1 and 2 except that only the opcode and destination
fields are used. Format 4 uses only the opcode field.

There is a trade-off between instruction format complexity and verification effort,
so in general the instruction format should be kept as simple as possible. A regular 
instruction format, while not essential to verification, can greatly reduce the amount 
of detail that has to be dealt with in the proof. 
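
The regularity of the formats shows up clearly when the fields are extracted in code. The following Python sketch decodes a 32-bit instruction word using the bit positions given above; the names Fields, decode, and encode_format2 are illustrative, and the encoder exists only to build test words.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    opcode: int  # bits 31-26
    dest: int    # bits 25-21
    a: int       # bits 20-16
    b: int       # bits 15-11 (meaningful in format 1)
    imm: int     # bits 15-0  (meaningful in format 2)


def decode(word: int) -> Fields:
    """Extract every field; the opcode decides which of them are meaningful."""
    return Fields(
        opcode=(word >> 26) & 0x3F,
        dest=(word >> 21) & 0x1F,
        a=(word >> 16) & 0x1F,
        b=(word >> 11) & 0x1F,
        imm=word & 0xFFFF,
    )


def encode_format2(opcode: int, dest: int, a: int, imm: int) -> int:
    """Build a format 2 word (opcode, dest, A, 16-bit immediate)."""
    return ((opcode & 0x3F) << 26) | ((dest & 0x1F) << 21) | ((a & 0x1F) << 16) | (imm & 0xFFFF)
```

Because every format places the opcode, destination, and A fields in the same positions, a single decoder serves all four formats.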


Instruction Set Semantics. The instruction format is essentially an instruction’s
syntax. Of course, syntax alone is not enough; we must also specify what
each instruction means. There are many ways of specifying the semantics of CPU
instructions; this dissertation will use two of them. In this section, we give a register 
transfer language description of the instructions in AVM-1. In Section 5.2.6 we give 
a formal description of the semantics of a sample of the instructions; the complete 
description can be found in [Win90b].

The 30 programming level instructions are shown in Table 5.3. There is a group
of eight 3-argument arithmetic instructions and another group of 8 arithmetic
instructions that use a 16-bit immediate value. There are 4 instructions for loading

Table 5.3: The AVM-1 instruction set.

Mnemonic  Format  Effect
--------  ------  ---------------------------------------------------
JMP       2       Jump to new location on condition flags
CALL      2       Call subroutine
INT       2       User interrupt
RTI       4       Return from interrupt
GPSW      3       Get program status word
PPSW      3       Put program status word
LD        1       Load register
ST        1       Store register
LSL       1       Logical shift left
LSR       1       Logical shift right
ASR       1       Arithmetic shift right
RTN       3       Return from subroutine
LDI       2       Load register using immediate value
STI       2       Store register using immediate value
ADD       1       Add
ADDC      1       Add with carry
SUB       1       Subtract
SUBC      1       Subtract with borrow (carry)
BAND      1       Bit-wise conjunction
BOR       1       Bit-wise disjunction
BXOR      1       Bit-wise exclusive disjunction
BNOT      1       Bit-wise negation
ADDI      2       Add using immediate value
ADDCI     2       Add with carry using immediate value
SUBI      2       Subtract using immediate value
SUBCI     2       Subtract with borrow using immediate value
BANDI     2       Bit-wise conjunction using immediate value
BORI      2       Bit-wise disjunction using immediate value
BXORI     2       Bit-wise exclusive disjunction using immediate value
NOOP      4       No operation


Table 5.4: Jump codes for the JMP instruction.

Code  Meaning
----  ------------------------
0     carry
1     no carry
2     overflow
3     no overflow
4     negative
5     positive
6     equal
7     not equal
8     lower or same (unsigned)
9     higher (unsigned)
10    less than (signed)
11    greater or equal (signed)
12    greater than (signed)
13    less or equal (signed)
14    unconditional
15    unconditional


and storing registers. In addition, there are instructions for performing user inter- 
rupts, jumps, subroutine calls, and shifts. 

The remainder of this section provides detailed descriptions of AVM-1’s instruction
set. The instructions are specified in our register transfer language and described
where appropriate. The RTL specification only describes the part of the
state that changes; state that is unaffected by the instruction is ignored. In the 
descriptions, a is the value of the A source field in the instruction, b is the value of 
the B source field, d is the value of the destination field, and imm is the immediate 
field value. 


JMP — jump. The JMP instruction jumps on one of 15 different conditions 
according to the value returned from the function jc. The destination field, d, is 
used as an argument to jc to select one of the jump conditions listed in Table 5.4. 
If the result is true, the sum of R[a] and imm is loaded into the program counter. 
Otherwise, the program counter is incremented. 

    jc(d) → (PC <= R[a] + imm) | (PC <= PC + 1)
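
As an illustration, the jump condition function can be sketched in Python using the flag expressions listed in Table 5.8. The sketch assumes that only the low 4 bits of the 5-bit destination field select the condition and it ignores address wrap-around; the function names and argument order are illustrative.

```python
def jc(code: int, cf: bool, vf: bool, nf: bool, zf: bool) -> bool:
    """Evaluate a jump code against the carry, overflow, negative, and zero flags."""
    signed_less = nf != vf  # nf XOR vf
    table = [
        cf,                         # 0  carry
        not cf,                     # 1  no carry
        vf,                         # 2  overflow
        not vf,                     # 3  no overflow
        nf,                         # 4  negative
        not nf,                     # 5  positive
        zf,                         # 6  equal
        not zf,                     # 7  not equal
        (not cf) or zf,             # 8  lower or same (unsigned)
        not ((not cf) or zf),       # 9  higher (unsigned)
        signed_less,                # 10 less than (signed)
        not signed_less,            # 11 greater or equal (signed)
        not (signed_less or zf),    # 12 greater than (signed)
        signed_less or zf,          # 13 less or equal (signed)
        True,                       # 14 unconditional
        True,                       # 15 unconditional
    ]
    return table[code & 0xF]  # assumption: only the low 4 bits of d are used


def jmp(pc: int, r_a: int, imm: int, code: int,
        cf: bool, vf: bool, nf: bool, zf: bool) -> int:
    """Next program counter for JMP: the jump target if jc holds, else PC + 1."""
    return r_a + imm if jc(code, cf, vf, nf, zf) else pc + 1
```

Each odd code is the complement of the even code before it, except for codes 14 and 15, which are both unconditional.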

CALL — call a subroutine. The program counter is loaded with the sum 
of R[a] and imm. The old value of the program counter is saved on a stack in 
memory. The destination field points to the stack pointer. 

    PC <= R[a] + imm,
    R[d] <= R[d] + 1,
    M[R[d]] <= PC + 1


Note that the operations in the above RTL description all happen in parallel (denoted
by the comma). Thus M[R[d]] refers to the memory value at the location
pointed to by the original value of R[d].


RTN — return from a subroutine. The top of the stack pointed to by register
R[d] is popped and loaded into the program counter.

    PC <= M[R[d] - 1],
    R[d] <= R[d] - 1
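
The parallel reading of the RTL matters for CALL and RTN, since the stack pointer appears on both sides of the transfers. A minimal Python sketch, assuming a state made of a register list, a memory dictionary, and a program counter (this representation is hypothetical), evaluates every right-hand side against the old state before committing any update:

```python
def call(state: dict, d: int, a: int, imm: int) -> None:
    """CALL: all right-hand sides are read from the old state, then committed."""
    R, M, pc = state["R"], state["M"], state["PC"]
    new_pc = R[a] + imm      # jump target
    old_sp = R[d]            # original stack pointer value
    return_addr = pc + 1
    M[old_sp] = return_addr  # M[R[d]] uses the original R[d]
    R[d] = old_sp + 1
    state["PC"] = new_pc


def rtn(state: dict, d: int) -> None:
    """RTN: pop the return address from the stack pointed to by R[d]."""
    R, M = state["R"], state["M"]
    old_sp = R[d]
    state["PC"] = M[old_sp - 1]
    R[d] = old_sp - 1
```

With this reading, a CALL followed by an RTN through the same stack register restores the stack pointer and resumes at the instruction after the call.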


INT — user interrupt. The INT instruction jumps to the location given in the
8 least significant bits of imm and stores the old program counter on the supervisor
stack. Interrupts are disabled and the CPU goes into supervisory mode.

    PC <= imm ∧ 255,
    R[ssp] <= R[ssp] + 1,
    M[R[ssp]] <= PC + 1,
    PSW₄ <= false,
    PSW₅ <= true


RTI — return from interrupt. The program counter gets the value on top of 
the supervisor stack, the value is popped from the top of the stack, interrupts are 
enabled, and the CPU leaves supervisory mode. 


    PC <= M[R[ssp] - 1],
    R[ssp] <= R[ssp] - 1,
    PSW₄ <= true,
    PSW₅ <= false

GPSW — get program status word. The program status word is stored in
the register selected by the destination field, R[d].

    R[d] <= PSW,
    PC <= PC + 1

PPSW — put program status word. The register selected by the destination
field, R[d], is moved to the program status word if the CPU is in supervisory mode.

    PSW₅ → (PSW <= R[d]),
    PC <= PC + 1


LD — load from memory. Register R[d] is loaded with the contents of memory
at the location given by the sum of registers R[a] and R[b].

    R[d] <= M[R[a] + R[b]],
    PC <= PC + 1


LDI — load from memory using immediate value. LDI operates exactly
like LD except that the address is given by the sum of R[a] and imm.

    R[d] <= M[R[a] + imm],
    PC <= PC + 1


ST — store to memory. The contents of the destination register, R[d], are
stored in memory at the address given by the sum of registers R[a] and R[b].

    M[R[a] + R[b]] <= R[d],
    PC <= PC + 1


STI — store to memory using immediate value. STI operates exactly like ST
except that the address is given by the sum of R[a] and imm.

    M[R[a] + imm] <= R[d],
    PC <= PC + 1


LSL — logical shift left. The destination register, R[d], gets the contents
of R[a] shifted left one position. The carry field of the program status word gets
the value of the bit that was shifted out.

    R[d] <= shl R[a],
    PSW₁ <= msb R[a],
    PC <= PC + 1


LSR — logical shift right. The destination register, R[d], gets the contents
of R[a] shifted right one position. The carry field of the program status word gets
the value of the bit that was shifted out.

    R[d] <= shr R[a],
    PSW₁ <= lsb R[a],
    PC <= PC + 1


ASR — arithmetic shift right. The destination register, R[d], gets the contents
of R[a] shifted right arithmetically one position. That is, the most significant
bit is retained in its position during the shift. The carry field of the program status
word gets the value of the bit that was shifted out.

    R[d] <= asr R[a],
    PSW₁ <= lsb R[a],
    PC <= PC + 1
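
The three shifts can be sketched in Python as functions that return both the shifted word and the bit that moves into the carry field. The 32-bit word width is an assumption (the text fixes only the instruction width), and the function names simply mirror the RTL keywords.

```python
WIDTH = 32                    # assumed register width
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def shl(x: int) -> tuple:
    """Logical shift left; the carry is the old most significant bit."""
    x &= MASK
    return (x << 1) & MASK, (x >> (WIDTH - 1)) & 1


def shr(x: int) -> tuple:
    """Logical shift right; the carry is the old least significant bit."""
    x &= MASK
    return x >> 1, x & 1


def asr(x: int) -> tuple:
    """Arithmetic shift right; the most significant bit keeps its position."""
    x &= MASK
    return (x >> 1) | (x & SIGN), x & 1
```

The only difference between shr and asr is that asr copies the sign bit back into the vacated position.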

NOOP — no operation. No state changes take place except that the program 
counter is incremented. 


    PC <= PC + 1


ADD — add. The destination register, R[d], gets the sum of the R[a] and R[b]
registers. The program status word is updated with the status from the ALU.

    R[d] <= R[a] + R[b],
    PSW(0-3) <= status,
    PC <= PC + 1


ADDI — add immediate. The result is identical to that of the ADD instruction
except that the value of the immediate field is used instead of R[b].

    R[d] <= R[a] + imm,
    PSW(0-3) <= status,
    PC <= PC + 1


ADDC — add with carry. The destination register, R[d], gets the sum of
the R[a] and R[b] registers plus the value of the carry bit in the program status
word. The program status word is updated with the status from the ALU.

    R[d] <= R[a] + R[b] + PSW₁,
    PSW(0-3) <= status,
    PC <= PC + 1


ADDCI — add immediate with carry. The result is identical to that of
the ADDC instruction except that the immediate field is used instead of the R[b]
register.

    R[d] <= R[a] + imm + PSW₁,
    PSW(0-3) <= status,
    PC <= PC + 1
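
For the addition instructions, the status returned by the ALU can be sketched in Python as follows. The sketch assumes 32-bit registers and the conventional definitions of the carry and signed-overflow flags; it packs the flags into PSW bits 0 to 3 in the order given in Table 5.2. It deliberately covers addition only, since the borrow convention for subtraction is not spelled out in this section.

```python
WIDTH = 32                    # assumed register width
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def add_with_status(a: int, b: int, carry_in: int = 0) -> tuple:
    """Return (result, PSW(0-3)) for ADD (carry_in = 0) or ADDC (carry_in = PSW bit 1)."""
    a &= MASK
    b &= MASK
    full = a + b + carry_in
    result = full & MASK
    zero = result == 0
    carry = full > MASK
    negative = bool(result & SIGN)
    # Signed overflow: the operands share a sign and the result's sign differs.
    overflow = bool(~(a ^ b) & (a ^ result) & SIGN)
    status = (zero << 0) | (carry << 1) | (negative << 2) | (overflow << 3)
    return result, status
```

A multi-word addition would use ADD on the low words and feed bit 1 of the returned status into ADDC on the high words.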


SUB — subtract. The destination register, R[d], gets the value produced by
subtracting R[b] from R[a]. The program status word is updated with the status
from the ALU.

    R[d] <= R[a] - R[b],
    PSW(0-3) <= status,
    PC <= PC + 1


SUBI — subtract immediate. The result is identical to that of the SUB instruction
except that the immediate field is used instead of the R[b] register.

    R[d] <= R[a] - imm,
    PSW(0-3) <= status,
    PC <= PC + 1


SUBC — subtract with borrow (carry). The destination register, R[d], gets
the value produced by subtracting the contents of the R[b] register and the value
of the carry bit from the contents of the R[a] register. The program status word is
updated with the status from the ALU.

    R[d] <= R[a] - R[b] - PSW₁,
    PSW(0-3) <= status,
    PC <= PC + 1


SUBCI — subtract immediate with borrow. The result is identical to that of
the SUBC instruction except that the immediate field is used instead of the contents
of the R[b] register.

    R[d] <= R[a] - imm - PSW₁,
    PSW(0-3) <= status,
    PC <= PC + 1


BAND — bit-wise conjunction. The destination register, R[d], gets the value
produced by taking the bit-wise conjunction of the contents of the R[a] register
with the contents of the R[b] register. The negative and zero flags in the program
status word are updated with the status from the ALU.

    R[d] <= R[a] ∧ R[b],
    PSW(0,2) <= status,
    PC <= PC + 1


BANDI — bit-wise conjunction with immediate. The result is identical to
that of the BAND instruction except that the immediate field is used instead of the
contents of the R[b] register.

    R[d] <= R[a] ∧ imm,
    PSW(0,2) <= status,
    PC <= PC + 1


BOR — bit-wise disjunction. The destination register, R[d], gets the value
produced by taking the bit-wise disjunction of the contents of the R[a] register
with the contents of the R[b] register. The negative and zero flags in the program
status word are updated with the status from the ALU.

    R[d] <= R[a] ∨ R[b],
    PSW(0,2) <= status,
    PC <= PC + 1


BORI — bit-wise disjunction with immediate. The result is identical to
that of the BOR instruction except that the immediate field is used instead of the
contents of the R[b] register.

    R[d] <= R[a] ∨ imm,
    PSW(0,2) <= status,
    PC <= PC + 1

BXOR — bit-wise exclusive disjunction. The destination register, R[d],
gets the value produced by taking the bit-wise exclusive disjunction of the contents
of the R[a] register with the contents of the R[b] register. The negative and zero
flags in the program status word are updated with the status from the ALU.

    R[d] <= R[a] ⊕ R[b],
    PSW(0,2) <= status,
    PC <= PC + 1


BXORI — bit-wise exclusive disjunction with immediate. The result is
identical to that of the BXOR instruction except that the immediate field is used
instead of the contents of the R[b] register.

    R[d] <= R[a] ⊕ imm,
    PSW(0,2) <= status,
    PC <= PC + 1


BNOT — bit-wise negation. The destination register, R[d], gets the value
produced by taking the bit-wise negation of the contents of the R[a] register. The
negative and zero flags in the program status word are updated with the status
from the ALU.

    R[d] <= ¬ R[a],
    PSW(0,2) <= status,
    PC <= PC + 1


Synthesizing Addressing Modes. Besides the CALL and INT instructions which 
must access a stack, only the load and store instructions can access memory. All 
of the other instructions only operate on the internal registers. This makes the 
implementation of the instruction set easier and results in faster operation of most 
of the instructions. 

The addresses for the load and store instructions are calculated using the sum 
of two numbers: a register and either a register or an immediate value. This is a 
flexible scheme which allows most popular addressing modes to be synthesized. 

Table 5.5 (adapted from [Kat85]) shows how the memory addressing scheme 
in AVM-1 can be used to support common constructs in modern high-level lan- 
guages. 

• In direct mode, the A register holds the base of the data segment and the
immediate value allows addressing within ±2^18 of the base.

• In indirect mode, the A register holds the value of the pointer. R[0] holds 
the constant 0. 

Table 5.5: Synthesizing addressing modes using AVM-1’s load and store instructions.

Mode      HLL Usage              Synthesizing in AVM-1
--------  ---------------------  ---------------------
Direct    Global Scalar          M[R[a] + imm]
Indirect  Pointer Dereferencing  M[R[a] + R[0]]
Indexed   Record Field           M[R[a] + imm]
Indexed   Array Element          M[R[a] + R[b]]


Table 5.6: Synthesizing instructions using AVM-1’s instruction set.

Instruction       Synthesizing in AVM-1
----------------  ------------------------------
Move s to d       ADD R[d] R[s] R[0]
Clear d           ADD R[d] R[0] R[0]
Set bit x in s    BORI R[s] R[s] 2^(x+1)
Clear bit x in s  BANDI R[s] R[s] 2^16 - 2^(x+1)
Test s            ADD R[0] R[s] R[0]
Increment s       ADDI R[s] R[s] 1
Decrement s       SUBI R[s] R[s] 1
Complement s      SUB R[s] R[0] R[s]


• To perform memory operations on records, the A register holds the base ad- 
dress of the record and the immediate field is used to hold the field offsets 
into the record. 

• Array operations are performed by using the A register to hold the base address
of the array and the B register to hold the index.
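
The four modes in Table 5.5 all reduce to one address computation. A minimal Python sketch, with a hypothetical helper name and a plain list standing in for the register file:

```python
from typing import List, Optional


def effective_address(R: List[int], a: int,
                      b: Optional[int] = None, imm: Optional[int] = None) -> int:
    """Address used by LD/ST (register B given) or LDI/STI (immediate given)."""
    if (b is None) == (imm is None):
        raise ValueError("supply exactly one of b or imm")
    return R[a] + (R[b] if b is not None else imm)


R = [0] * 32
R[9] = 0x2000   # base address: data segment, pointer, record, or array
R[10] = 3       # array index

direct = effective_address(R, 9, imm=12)     # global scalar at offset 12
indirect = effective_address(R, 9, b=0)      # pointer dereference via R[0] = 0
record_field = effective_address(R, 9, imm=4)
array_element = effective_address(R, 9, b=10)
```

Direct and record-field accesses are the same computation; they differ only in what the base register is taken to mean.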


Synthesizing Other Instructions. Even though the instruction set of AVM-1 
is quite simple, many common instructions can be synthesized using only one in- 
struction. For example, a move instruction can be synthesized by adding the register 
to be moved to R[0] which always contains 0. Table 5.6 shows the implementation 
of this and other instructions. 

The idea behind the simple instruction set of AVM-1 is to implement the opera- 
tions that are used frequently in hardware and synthesize operations that are used 
less frequently by composing simple operations. For example, the memory address- 
ing scheme allows the implementation of the common stack operations in just a few 
primitive instructions. Another example is clearing or setting a bit in the program 
status word. This takes at least three operations and a temporary register. This is 
acceptable, however, since toggling program status word bits is an operation that 
occurs much less frequently than the operations that are built into the instruction 
set. 
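
Some of the rows of Table 5.6 can be checked with a very small Python model. The sketch below models only the register effects of ADD, ADDI, and SUBI (status flag and program counter updates are omitted), assumes 32-bit registers, and uses illustrative helper names.

```python
def read(R: list, i: int) -> int:
    return 0 if i == 0 else R[i]       # R[0] always reads as 0


def write(R: list, i: int, value: int) -> None:
    if i != 0:                         # R[0] is read-only
        R[i] = value & 0xFFFFFFFF      # assumed 32-bit registers


def ADD(R, d, a, b):
    write(R, d, read(R, a) + read(R, b))


def ADDI(R, d, a, imm):
    write(R, d, read(R, a) + imm)


def SUBI(R, d, a, imm):
    write(R, d, read(R, a) - imm)


# Synthesized instructions, following Table 5.6.
def move(R, d, s):
    ADD(R, d, s, 0)


def clear(R, d):
    ADD(R, d, 0, 0)


def increment(R, s):
    ADDI(R, s, s, 1)


def decrement(R, s):
    SUBI(R, s, s, 1)
```

In the full instruction set the Test row works the same way: the sum is discarded because R[0] cannot be written, and only the status flags change.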


Table 5.7: Opcode breakdowns for AVM-1’s instruction set.

        00XXX   01XXX   10XXX   11XXX
        -----   -----   -----   -----
000     JMP     LSL     ADD     ADDI
001     CALL    LSR     ADDC    ADDCI
010     INT     ASR     SUB     SUBI
011     RTI     RTN     SUBC    SUBCI
100     GPSW    NOOP    BAND    BANDI
101     PPSW    NOOP    BOR     BORI
110     LD      LDI     BXOR    BXORI
111     ST      STI     BNOT    NOOP


5.1.1.4 Selecting Instructions.

We select instructions in the instruction set using the opcode portion of the word 
in memory pointed to by the current value of the program counter. We will only 
use the 5 least significant bits of the opcode field, allowing 32 instructions. 

Table 5.7 gives a breakdown of the opcodes for AVM-1. The instruction set is 
divided into four groups depending on the value of the first 2 bits in the opcode. The 
first two groups contain miscellaneous instructions, the third group contains ALU 
operations and the fourth group contains the immediate version of the instructions 
in group 3. 
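
The opcode breakdown in Table 5.7 translates directly into a lookup table. In the Python sketch below the table contents come from Table 5.7; the function name and the row-by-group layout are illustrative choices.

```python
# Rows are indexed by the low 3 opcode bits, columns by the 2 group bits.
OPCODE_TABLE = [
    # 00XXX   01XXX   10XXX   11XXX
    ("JMP",  "LSL",  "ADD",  "ADDI"),    # XX000
    ("CALL", "LSR",  "ADDC", "ADDCI"),   # XX001
    ("INT",  "ASR",  "SUB",  "SUBI"),    # XX010
    ("RTI",  "RTN",  "SUBC", "SUBCI"),   # XX011
    ("GPSW", "NOOP", "BAND", "BANDI"),   # XX100
    ("PPSW", "NOOP", "BOR",  "BORI"),    # XX101
    ("LD",   "LDI",  "BXOR", "BXORI"),   # XX110
    ("ST",   "STI",  "BNOT", "NOOP"),    # XX111
]


def mnemonic(opcode_field: int) -> str:
    """Select an instruction from the 5 least significant bits of the opcode field."""
    op = opcode_field & 0x1F      # the sixth opcode bit is ignored
    group, row = op >> 3, op & 0x7
    return OPCODE_TABLE[row][group]
```

The 32 table slots hold the 30 distinct instructions, with NOOP filling the three otherwise unused opcodes.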


5.1.2 An Organizational View. 

A computer’s organization is its structure — what components are used and to what 
effect. An organization must define the behavior of the components and how they 
are connected together. Abstractly, the goal of the organization is to implement 
a particular architecture; but, there may be system requirements not expressed at 
the architectural level (such as the memory interface) that are specified and met at 
the organizational level. 

There are many ways of describing a computer organization. Circuit diagrams, 
computer programs, natural language, CAD tools, and mixtures of all of these have 
been used. This section will describe the implementation of AVM-1 using circuit 
diagrams, pictures, and natural language descriptions. 

The implementation of AVM-1 can be divided into two major parts: the datapath 
and the control unit. We will discuss each of these. 


Table 5.8: Implementation of the jump codes for the JMP instruction. cf is the
carry flag in the PSW, zf is the zero flag, etc.

Code  Implementation
----  ------------------
0     cf
1     ¬cf
2     vf
3     ¬vf
4     nf
5     ¬nf
6     zf
7     ¬zf
8     (¬cf ∨ zf)
9     ¬(¬cf ∨ zf)
10    (nf ⊕ vf)
11    ¬(nf ⊕ vf)
12    ¬((nf ⊕ vf) ∨ zf)
13    ((nf ⊕ vf) ∨ zf)
14    true
15    true


5.1.2.1 The AVM-1 Datapath.

The AVM-1 datapath is loosely based on the AMD 2903 bit-sliced datapath [Adv83] 
and is shown in Figure 5.2. The signals shown at the right-hand side of the fig- 
ure connect to the control unit. The signals on the left go to or come from the 
environment. Note that none of the clocking signals are shown. 

The datapath has three buses, a register file containing 32 registers, and numerous 
support registers and latches. Two buses, A and B, are connected to the output ports 
on the register file and system registers. The C bus is connected to the input port on 
the register file and the system registers. In addition, the interrupt vector register 
is attached to the B bus through a special port to the interrupt controller. 

The A and B buses feed the inputs to the ALU through two latches. The memory 
buffer register can also serve as the A input to the ALU through a multiplexor on 
the ALU input. The ALU performs simple arithmetic and boolean operations on 
the values on its A and B inputs. The results of the ALU operation are fed to the 
shifter which can perform logical and arithmetic shifts. The result from the shifter 
is put onto the C bus for distribution. 

In addition to a result, the ALU produces a set of status bits (negative, zero, 
carry, and overflow) which can be saved in the program status word directly. A 
one-bit multiplexor also allows the bit shifted out of the shifter to be saved in the 
carry field of the PSW. The control lines to the PSW allow the supervisor and
interrupt enable bits to be set and cleared and each of the status bits to be loaded 
individually. 

The status from the PSW and the destination field of the instruction register are 
fed into the jump code circuitry. This combinatorial circuit calculates the jump 
conditions shown in Table 5.8 and supplies a boolean result which is used to deter- 
mine if the program counter should be loaded from the C bus. The program counter 
can also be loaded unconditionally. 

The instruction register can be loaded from the C bus, but only the immediate 
portion of the instruction register can be placed on the B bus. 

The memory address register can be loaded directly from the program counter or 
from the C bus. This allows the MAR to be loaded quickly for instruction fetches 
while still allowing calculated addresses for loads and stores. 

The datapath has two flipflops for holding the status of interrupt actions and 
three demultiplexors for decoding register selection signals from the control unit. 


5.1.2.2 The Control Unit.

The control unit for AVM-1 is shown in Figure 5.3. The control unit has four major 
blocks: the microprogram counter, the microinstruction register, the clock, and the 
microrom. 

The microprogram counter is the most complex of the four. The purpose of 
the microprogram counter is to compute the next address for the microprogram 
based on the current system state. The microprogram counter is fed the condition 
and address (addr) fields from the microinstruction register, the opcode from the 
instruction register, and the supervisory and interrupt enable bits from the program 
status word. There are 5 jump conditions: 

1. No jump; the microprogram counter is incremented. This is the default oper- 
ation. 

2. Jump to addr unconditionally.

3. Jump to the location given by the opcode signal and an offset (4 in this case). 
This allows us to use a table lookup approach to instruction decoding in the 
microcode. We only use the 5 least significant bits of the 6-bit opcode; the 
top half of the instruction set is reserved for a coprocessor. 

4. Jump to addr if the interrupt signal is true and interrupts are enabled. 

5. Jump to addr if the supervisory mode signal is true. 
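
As an illustration only, the five jump conditions above can be sketched in Python. The report does not give the numeric encodings of the conditions, so the `Cond` names and values here are hypothetical, and falling through to an increment when a conditional jump is not taken is an assumption. The 6-bit mask reflects the 6-bit ADDR field of the microinstruction.

```python
from enum import Enum

class Cond(Enum):
    # Hypothetical names and encodings; the text lists the conditions only.
    NEXT = 0           # no jump: increment the microprogram counter
    JUMP = 1           # unconditional jump to addr
    DECODE = 2         # jump to opcode + offset (table lookup)
    ON_INTERRUPT = 3   # jump if interrupt pending and interrupts enabled
    ON_SUPERVISOR = 4  # jump if in supervisory mode

DECODE_OFFSET = 4      # offset stated in the text
MPC_MASK = 0x3F        # 6-bit microprogram address

def next_mpc(mpc, cond, addr, opcode, interrupt, ie, sm):
    """Return the next microprogram counter value."""
    if cond is Cond.JUMP:
        return addr & MPC_MASK
    if cond is Cond.DECODE:
        # Only the 5 least significant bits of the 6-bit opcode are used.
        return ((opcode & 0x1F) + DECODE_OFFSET) & MPC_MASK
    if cond is Cond.ON_INTERRUPT and interrupt and ie:
        return addr & MPC_MASK
    if cond is Cond.ON_SUPERVISOR and sm:
        return addr & MPC_MASK
    # Default (assumed also for conditional jumps not taken): increment.
    return (mpc + 1) & MPC_MASK
```

With this sketch, a DECODE step on opcode 3 lands on microrom address 7, the fourth entry after the offset.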

The microinstruction register is a 40-bit register that holds the current microin-
struction. The only special feature of the register is that each of the fields from 
the microinstruction are available through separate ports for use elsewhere in the 
control unit and datapath. 

The microinstruction format is shown in Table 5.9. A microinstruction consists 
of 40 bits in 24 fields. The fields in a microinstruction can be broken into 4 groups: 
those affecting the operation of the microprocessor, those affecting the program 
status word, those dealing with external signals, and those that are used for mi- 
croinstruction sequencing. 

The operational group consists of the following fields: 

• AMUX - If set, the A-latch (feeding the ALU) is loaded from the memory 
buffer register, otherwise the A-latch is loaded from the A-bus. 

• SHFT - This field is passed unchanged to the shifter where it is used to select 
the shifter operation. 

• ALU - This field is passed unchanged to the ALU where it is used to select 
the ALU operation. 

• MAR - If high, the MAR is loaded with the value on the output port of the 
PMUX. 

• MBR - If high, the MBR is loaded from the C-Bus 

• PMUX - Determines the value of the PMUX output. If high, the output is 
equal to the value in the program counter, otherwise the output is equal to 
the value on the C-Bus. 

• SRCA - Determines the source of the value on the A-Bus. 

• SRCB - Determines the source of the value on the B-Bus. 

• TRGT - Selects a register in which to store the value on the C-Bus. 

The program status word group consists of the following fields: 


• S_SM - When high, the supervisory mode bit in the PSW is set. 

• C_SM - When high, the supervisory mode bit in the PSW is cleared. 

• S_IE - When high, the interrupt enable bit in the PSW is set. 

• C_IE - When high, the interrupt enable bit in the PSW is cleared.

• LD_C - When high, the carry bit in the PSW is loaded from the carry-bit 
input port. 

Table 5.9: The microinstruction format for AVM-1.

Operation Group

Bits  Mnemonic  Description
1     AMUX      Toggle MUX on A-bus
2     SHFT      Shifter function
4     ALU       ALU function
1     MAR       Load MAR from P-Mux
1     MBR       Load MBR from C-bus
1     PMUX      Toggle MUX loading MAR
3     SRCA      A-bus source
2     SRCB      B-bus source
3     TRGT      C-bus target

Program Status Word Group

Bits  Mnemonic  Description
1     S_SM      Set supervisory mode bit in PSW
1     C_SM      Clear supervisory mode bit in PSW
1     S_IE      Set interrupt enable bit in PSW
1     C_IE      Clear interrupt enable bit in PSW
1     LD_C      Load carry bit in PSW
1     LD_V      Load overflow bit in PSW
1     LD_N      Load negative bit in PSW
1     LD_Z      Load zero bit in PSW
1     CSRC      Source of carry (shifter or ALU)

External Signals Group

Bits  Mnemonic  Description
1     IACK      Interrupt acknowledge signal
1     FTCH      Fetch signal
1     RD        Read signal
1     WR        Write signal

Microprogram Counter Group

Bits  Mnemonic  Description
3     COND      Microcode jump condition
6     ADDR      Next address

• LD_V - When high, the overflow bit in the PSW is loaded from the overflow- 
bit input port. 

• LD_N - When high, the negative bit in the PSW is loaded from the negative- 
bit input port. 

• LD_Z - When high, the zero bit in the PSW is loaded from the zero-bit input 
port. 

• CSRC - The ALU and shifter both produce a carry out. This bit controls
a multiplexor that selects which of these carry signals is fed to the carry-bit 
input port on the PSW. 

The external signals group consists of the following fields: 

• IACK - This value is passed to the interrupt acknowledge flipflop to control 
the external interrupt acknowledge signal. 

• FTCH - Passed to the environment to inform external devices that the CPU 
is in fetch mode. 

• RD - Used to control loading of the MAR and MBR. It is also passed to the
environment to control reading from memory and other devices. 

• WR - Used to control loading of the MAR and MBR. It is also passed to the 
environment to control writing to memory and other devices. 


The microprogram counter group consists of the following fields: 

• COND - Selects one of 8 possible jump conditions for the microprogram
counter. Every microinstruction is a potential control point in the micropro- 
gram. Sequencing is done explicitly. 

• ADDR - The next address for the microprogram counter. This may or may 
not be used depending on the value of the cond field. 

The clock is a simple four-phase counter with a strobe line for each phase.
Figure 5.4 shows the output timing for the clock. The clk1 line, for example, is only
true during phase 1, the clk2 line is true during phase 2, and so on.

The microrom holds the microcode and is made from a read-only memory that
is 40 bits wide and 64 words long.
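
The field widths in Table 5.9 can be checked mechanically. The following Python sketch packs and unpacks a 40-bit microinstruction; the field order (first table row in the most significant bits) and the `pack`/`unpack` helper names are assumptions made for illustration, since the report gives the widths but not the bit positions.

```python
# (mnemonic, width in bits), in the order the fields appear in Table 5.9.
FIELDS = [
    ("AMUX", 1), ("SHFT", 2), ("ALU", 4), ("MAR", 1), ("MBR", 1),
    ("PMUX", 1), ("SRCA", 3), ("SRCB", 2), ("TRGT", 3),
    ("S_SM", 1), ("C_SM", 1), ("S_IE", 1), ("C_IE", 1),
    ("LD_C", 1), ("LD_V", 1), ("LD_N", 1), ("LD_Z", 1), ("CSRC", 1),
    ("IACK", 1), ("FTCH", 1), ("RD", 1), ("WR", 1),
    ("COND", 3), ("ADDR", 6),
]
WORD_BITS = sum(width for _, width in FIELDS)

def pack(values):
    """Pack a {mnemonic: value} dict into one integer; missing fields are 0."""
    word = 0
    for name, width in FIELDS:
        v = values.get(name, 0)
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word = (word << width) | v
    return word

def unpack(word):
    """Split a packed microinstruction back into its fields."""
    out = {}
    for name, width in reversed(FIELDS):
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out
```

Summing the widths confirms the figures quoted in the text: 24 fields and 40 bits.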

Figure 5.4: The clock signals in AVM-1 (clk1, clk2, clk3, clk4).



Figure 5.5: A PERT phase diagram for AVM-1. 

5.1.2.3 Timing.

The timing of AVM-1 is based on a four-phase clock (see Figure 5.5). During the
four phases, the machine performs the following state transitions: 

1. In phase 1, the microinstruction register is loaded from the microrom. 

2. In phase 2, the latches feeding the ALU are loaded from the register file and
system registers.

3. In phase 3, the results from the ALU and shifter are calculated. In addition,
the MAR can be loaded from the PC in this phase.

Table 5.10: Comparison of verified microprocessors and AVM-1.

                  AVM-1     Tamarack-3  FM8501    Viper    SECD
User Registers    31        2           8         4        4
Instructions      30        8           26        20       21
Microcoded        yes       yes         yes       no       yes
Microstore size   64 words  32 words    16 words  N/A      512 words
Interrupts        yes       yes         no        no       no
Supervisory Mode  yes       no          no        no       no
Memory Model      sync      async       async     sync     sync
Word Width        32-bit    16-bit      16-bit    32-bit   32-bit
Memory Size       4G        8K          64K       1M       16K

4. In phase 4, the result calculated in phase 3 is stored back into the register file 
and system registers. 

Every microinstruction is executed by this phase sequence. 

Since microinstructions are used to implement the macroinstructions, the timing
for a macroinstruction depends on the number of microinstructions in its
implementation. In most cases this number is 4.
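
A minimal Python sketch of the timing described above follows. The generator models the four strobe lines, one true per phase; the names `clock` and `phases_for` are ours, not the report's.

```python
from itertools import islice

def clock():
    """Yield (clk1, clk2, clk3, clk4) forever, exactly one strobe true per phase."""
    phase = 0
    while True:
        yield tuple(i == phase for i in range(4))
        phase = (phase + 1) % 4

def phases_for(microinstructions):
    """Clock phases needed by a macroinstruction of the given microcode length."""
    return 4 * microinstructions

# The first phases of the clock: clk1 is true first, then clk2, and so on.
first_five = list(islice(clock(), 5))
```

Under these assumptions, a typical macroinstruction of 4 microinstructions occupies 16 clock phases.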


5.1.3 Comparisons. 

Table 5.10 compares the design of AVM-1 to the designs of the four microprocessors 
discussed in Chapter 2. The table, like all such tabulations, cannot hope to capture 
all of the important characteristics of the microprocessors, but the data presented 
does provide some basis for judging relative complexities. 


5.1.4 Observations. 

Having completed the description of AVM-1's architecture and organization, we
have several observations. 

The design of AVM-1 is not intended to push the architectural envelope, but 
rather to serve as a test bed for experimenting with using generic interpreter proofs 
in microprocessor verification. To this end, we have tried to include interesting 
features (such as a privileged mode and interrupts), but have not been overly anxious 
about small inefficiencies. 

For example, the implementation of AVM-1 is not optimal. A good example of 
where the implementation could be improved is the FETCH-ISSUE-DECODE cycle.
The only purpose of the ISSUE microinstruction is to transfer the contents of the
memory buffer register to the instruction register. Being able to read the memory 
bus directly into the instruction register during the FETCH cycle would eliminate 
this step and result in nearly a 25% speed-up in the execution time since almost 
every instruction is implemented in 4 microinstructions. Making this modification 
to the design of AVM-1 would have very little impact on the verification. 

The implementation makes the assumption that memory can be read in a single 
machine cycle. This is not unreasonable given the speed of today’s high-speed 
memory devices, but limits the usefulness of the chip. A more versatile approach 
would be to interface memory to the CPU asynchronously. Eventually, AVM-1 will 
have an asynchronous memory interface so that it can be coupled with the memory 
management unit being designed as part of the UC Davis Verified Chip Set. In 
anticipation of this change to the design, the specification in the next section uses 
the asynchronous generic interpreter theory. 

5.2 AVM-1's Formal Specification.


Section 5.1 presented an informal description of AVM-1. This section presents 
the formal specification of the microprocessor at each level in the decomposition 
hierarchy. We begin by describing the electronic block model and then present the 
phase-level specification, the micro-level specification, and finally the macro-level 
specification. 

Turning an informal description of a microprocessor into a formal specification 
is a difficult task. Avra Cohn, in [Coh89], describes her specification of VIPER’s 
electronic block model from informal descriptions supplied by VIPER’s designers 
as follows: 

. . . VIPER’s top-level specification and its major-state level were both 
supplied in a logical language; but its block-level model was given partly 
formally and partly pictorially (as was natural). Combining these two 
parts required both ingenuity and some guesswork. The guesses were
based on the coincidence of line names, on the names of bound variables 
in the functional definitions, and on the annotations in the text of the 
definitions. None of these notational devices can be regarded as formal 
specification. 

This quote not only tells of the difficulties of developing formal specifications from
the kinds of informal descriptions commonly in use, but also alludes to the
inadequacies of those descriptions. The formal specification of AVM-1 was probably
easier than that of VIPER since the designer and the specifier were the same person.

The rest of this section is organized as follows: We begin by describing the theory 
of abstract words that is used in the specification of AVM-1. Following that, we 
present the specifications of the electronic block model, phase-level, micro-level, 
and macro-level in turn. We also describe the definition of the microcode. There 
is a fair amount of detail and it is easy to get lost. Each of the sections describing
a particular level has been further divided into important subparts.

The electronic block model specification is unique. The electronic block model 
is a composition of two large blocks: the datapath and the control unit. Within 
these two blocks are many major blocks. Each of the major blocks is described
in a separate subpart of the section. Many of these can be skipped by readers not
interested in the details without losing continuity. 

The descriptions of the abstract interpreter levels all follow the pattern imposed 
by the generic theory. The generic theory requires that we make definitions for each
of the abstract objects in the representation; the following abstract objects will be
defined in each section: inst_list, select, key, substate, and subenv. We will
break each section into parts defining each of these abstract objects. (Note that we
do not have to define Impl, count, and begin; they are defined by the lower level
in the hierarchy.)

One note about the following descriptions: this section attempts to describe the 
meaning of the formal specifications in English text. In all cases, the true meaning 
should be taken from the logic, not the English description accompanying it. 


5.2.1 A Theory of Abstract Words. 

The specification of any microprocessor is based upon the fundamental data type 
that the microprocessor is to manipulate and a set of primitive operations on 
that data type. Usually, the data type is a bit-vector and the primitive opera- 
tions define addition, subtraction, and so on for bit-vectors. Sometimes a single 
specification may use more than one representation. For example, the verification 
of MACS [Win90a] used natural numbers as the base type in the abstract repre- 
sentations and a bit-vector representation in the electronic block model. 

The verification of the microprocessor is orthogonal to the concrete representation 
of the fundamental data type. Using concrete data representations for defining the 
fundamental data type clutters the proof with the implementation details of the 
data type; these are frequently a bother to manipulate and usually irrelevant in the 
correctness proof. 

We can solve this problem by choosing an abstract representation for the
fundamental data type. Our abstract data type is called :*wordn and we have defined a
number of abstract operations on it.

The fact that there are two abstract representations used in this dissertation might 
be a point for some confusion. The generic interpreter theory uses an abstract rep- 
resentation to specify the operations of the generic interpreter. This representation 
is instantiated with the definitions for the various levels in the decomposition in the 
course of completing the verification. 

The definitions for the various levels in the design are also parameterized over the 
abstract representation for the fundamental data type for AVM-1. Thus, the cor- 
rectness result for the microprocessor forms yet another generic theory. The generic 
theory for the microprocessor must be instantiated with a concrete representation 
for bit-vectors in order to arrive at the gate-level implementation of the electronic 
block model and complete the implementation. 

The abstract theory of n-bit words defines the following abstract objects through 
use: 

• *wordn - the type for n-bit words. 

• *memory - the type for memories.



• *address - the type for memory addresses. 

• *reg_len - the type for bit-vectors used to select registers. 

The operations in the abstract theory form the set of primitive operations for 
defining the blocks in the electronic block model and specifying the actions taken 
by instructions at all levels. Some may object to using abstract operations for 
defining the behavior of the macro-level as well as electronic block model. This is 
really no different, however, than using the + symbol at both levels. The fact that
one operation has a concrete definition and the other does not makes no difference.
In fact, the concrete definition attached to the + symbol may fool the reader of the
specification into believing that the microprocessor has been proven to correctly 
add, when in fact, it has not. The use of abstract representations for this purpose 
makes it clear which operations are taken as primitive and consequently not verified. 

The abstract representation for n-bit words is large and contains several sections. 
We will deal with each of them individually. 


ALU Functions. The n-bit word theory defines the following abstract functions 
for defining ALU operations: 

• (`add`, ":(*wordn × *wordn → *wordn)") - add two n-bit words.

• (`addc`, ":(*wordn × *wordn × bool) → *wordn") - add two n-bit words
with carry.

• (`addp`, ":(*wordn × *wordn × *wordn) → bool") - predicate that uses
the arguments to and result from the add operation to determine if carry-out
has occurred.

• (`addcp`, ":(*wordn × *wordn × *wordn) → bool") - determine if carry-out
has occurred using the arguments to and result from the addc operation.

• (`aovfl`, ":(*wordn × *wordn × *wordn) → bool") - determine if overflow
has occurred using the arguments to and result from the add and addc
operations.

• (`inc`, ":(*wordn → *wordn)") - increment an n-bit word.

• (`sub`, ":(*wordn × *wordn → *wordn)") - subtract two n-bit words.

• (`subc`, ":(*wordn × *wordn × bool) → *wordn") - subtract two n-bit
words with carry (borrow).

• (`subp`, ":(*wordn × *wordn × *wordn) → bool") - predicate that uses
the arguments to and result from the sub and subc operations to determine
if carry-out has occurred.

• (`sovfl`, ":(*wordn × *wordn × *wordn) → bool") - determine if overflow
has occurred using the arguments to and result from the sub and subc
operations.

• (`dec`, ":(*wordn → *wordn)") - decrement an n-bit word.

• (`band`, ":(*wordn × *wordn → *wordn)") - perform bitwise conjunction
of two n-bit words.

• (`bxor`, ":(*wordn × *wordn → *wordn)") - perform bitwise exclusive-disjunction
of two n-bit words.

• (`bor`, ":(*wordn × *wordn → *wordn)") - perform bitwise disjunction
of two n-bit words.

• (`bnot`, ":(*wordn → *wordn)") - perform bitwise negation of an n-bit
word.
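
To make the idea of instantiating the abstract representation concrete, here is a small Python sketch. `WordRep` stands in for the abstract `rep` and `make_rep` is a hypothetical instantiation with unsigned n-bit integers; only a few of the operations listed above are shown, and the carry and overflow formulas are the standard two's-complement ones rather than anything taken from the report.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class WordRep:
    """A record of primitive word operations (a stand-in for `rep`)."""
    add: Callable[[int, int], int]
    addp: Callable[[int, int, int], bool]
    aovfl: Callable[[int, int, int], bool]
    band: Callable[[int, int], int]
    bnot: Callable[[int], int]

def make_rep(n: int) -> WordRep:
    """Instantiate the operations for n-bit words held as unsigned integers."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    return WordRep(
        add=lambda a, b: (a + b) & mask,
        # Carry-out occurred iff the truncated result is below an operand.
        addp=lambda a, b, r: r < a,
        # Signed overflow: both operands differ in sign from the result.
        aovfl=lambda a, b, r: bool((a ^ r) & (b ^ r) & sign),
        band=lambda a, b: a & b,
        bnot=lambda a: ~a & mask,
    )

rep8 = make_rep(8)
result = rep8.add(200, 100)   # wraps modulo 2**8
```

Code written against `WordRep` never mentions the word width, which mirrors how the specification is parameterized over the representation.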


Test functions. In addition to the operations used to define the ALU operations, 
two predicates for testing whether a number is negative and whether it is zero are 
used in the specification. 

• (`negp`, ":(*wordn → bool)") - is the argument negative?

• (`zerop`, ":(*wordn → bool)") - is the argument zero?


SHIFTER functions. The shifter has a set of primitive operations as well: 

• (`shl`, ":(*wordn → *wordn)") - shift the argument left one bit.

• (`shr`, ":(*wordn → *wordn)") - shift the argument right one bit.

• (`asr`, ":(*wordn → *wordn)") - arithmetically shift the argument right
one bit (i.e. preserve the sign bit).


Bit functions. We do not need a full range of bit manipulation functions in the 
specification, but we do need to select the most significant and least significant bits. 

• (`msb`, ":(*wordn → bool)") - select the most significant bit in the
argument.

• (`lsb`, ":(*wordn → bool)") - select the least significant bit in the
argument.
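
One possible concrete reading of the shifter and bit functions, again for unsigned n-bit integers, is sketched below in Python. The factory `make_shifter` is a hypothetical helper; the report leaves these operations abstract.

```python
def make_shifter(n):
    """Return (shl, shr, asr, msb, lsb) for n-bit words held as unsigned ints."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)

    def shl(w):
        return (w << 1) & mask        # the bit shifted out is dropped

    def shr(w):
        return w >> 1                 # logical shift: zero enters at the top

    def asr(w):
        return (w >> 1) | (w & sign)  # arithmetic shift: sign bit is preserved

    def msb(w):
        return bool(w & sign)

    def lsb(w):
        return bool(w & 1)

    return shl, shr, asr, msb, lsb

shl, shr, asr, msb, lsb = make_shifter(8)
```

In this reading, `msb` of the input to `shl` and `lsb` of the input to `shr` give the bit shifted out, which is the bit the datapath can route to the carry field of the PSW.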

Coercion functions. Coercion functions convert objects from one type to
another.

• (`val`, ":(*wordn → num)") - return the numeric value of an n-bit word.

• (`wordn`, ":(num → *wordn)") - return the n-bit word representation of a
number.

• (`reg_len`, ":(*reg_len → num)") - coerce a value of type :*reg_len
to a number.

• (`address`, ":(*wordn → *address)") - return the address representation
of an n-bit word.

The use of type :*address gives the user of the abstract word representation the
freedom to use only portions of a word for an address or to manipulate them in
some way prior to use.


Subranging functions. Subranging functions return a portion of an n-bit word 
corresponding to some meaningful component. The following functions are used to 
implement the instruction formats in AVM-1. 

• (`opcode`, ":(*wordn → bt6)") - return the opcode portion of an n-bit
word which is represented as a boolean 6-tuple.

• (`dest`, ":(*wordn → *reg_len)") - return the portion of an n-bit word
designating the destination register of an operation.

• (`srca`, ":(*wordn → *reg_len)") - return the portion of an n-bit word
designating the source A register of the operation.

• (`srcb`, ":(*wordn → *reg_len)") - return the portion of an n-bit word
designating the source B register of the operation.

• (`imm`, ":(*wordn → *wordn)") - return the portion of an n-bit word
designating the immediate value used in the operation.


The use of type :*reg_len to describe the size of the sub-word designating registers
makes the proof independent of the size of the register file. The opcode, however, is 
returned as a boolean 6-tuple. Making it concrete has advantages in the verification. 
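
The subranging functions can be pictured with the following Python sketch. The bit positions are purely hypothetical (a 32-bit word with a 6-bit opcode in the top bits, three 5-bit register fields, and an 11-bit immediate); the abstract theory deliberately does not fix them, and AVM-1's actual instruction formats are defined elsewhere in the report.

```python
def field(word, hi, lo):
    """Extract bits hi..lo (inclusive) of word as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def opcode(word):
    """Return the opcode as a boolean 6-tuple, most significant bit first."""
    v = field(word, 31, 26)           # hypothetical position
    return tuple(bool((v >> i) & 1) for i in range(5, -1, -1))

def dest(word):
    return field(word, 25, 21)        # hypothetical position

def srca(word):
    return field(word, 20, 16)        # hypothetical position

def srcb(word):
    return field(word, 15, 11)        # hypothetical position

def imm(word):
    return field(word, 10, 0)         # hypothetical position

sample = (0b100101 << 26) | (3 << 21) | (7 << 16) | (1 << 11) | 42
```

Returning the opcode as a tuple of booleans, rather than a number, matches the decision described above to make the opcode concrete.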

Constructor and selectors for the Program Status Word. The program
status word is a register that keeps track of the status of the most recent ALU 
operation as well as recording whether or not the CPU is in supervisory mode 
and whether or not interrupts are enabled. The following operations represent a 
constructor and 6 selectors on the program status word. 

• (`mk_psw`, ":(bt6 → *wordn)") - construct a new program status word.

• (`get_ie`, ":(*wordn → bool)") - select the interrupt enable bit in the
program status word.

• (`get_sm`, ":(*wordn → bool)") - select the supervisory mode bit in the
program status word.

• (`get_cf`, ":(*wordn → bool)") - select the carry bit in the program
status word.

• (`get_vf`, ":(*wordn → bool)") - select the overflow bit in the program
status word.

• (`get_zf`, ":(*wordn → bool)") - select the zero bit in the program
status word.

• (`get_nf`, ":(*wordn → bool)") - select the negative bit in the program
status word.
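
A minimal Python sketch of one way the constructor and selectors could be instantiated is given below. Packing the six flags into the low six bits of an integer is an assumption made for illustration; the tuple order (sm, ie, vf, nf, cf, zf) follows the order in which the PSW register specification later in this section passes the fields to the constructor.

```python
def mk_psw(bits):
    """Build a PSW word from the 6-tuple (sm, ie, vf, nf, cf, zf)."""
    word = 0
    for b in bits:
        word = (word << 1) | int(b)
    return word

# Hypothetical bit positions: sm is bit 5, zf is bit 0.
def get_sm(w): return bool((w >> 5) & 1)
def get_ie(w): return bool((w >> 4) & 1)
def get_vf(w): return bool((w >> 3) & 1)
def get_nf(w): return bool((w >> 2) & 1)
def get_cf(w): return bool((w >> 1) & 1)
def get_zf(w): return bool(w & 1)

psw = mk_psw((True, False, False, True, False, True))
```

Any instantiation must satisfy the obvious round-trip property: each selector applied to a constructed word returns the corresponding component of the tuple.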


Memory functions. We need special functions for interacting with memory be- 
cause it represents shared state. The CPU cannot assume that it is the only device 
that changes memory. The fetch and store operations are fairly self-explanatory.
The use of the abstract transformation functions is described in Section 3.4. 

• (`fetch`, ":(*memory × *address) → *wordn") - retrieve a word from
memory at a particular address.

• (`store`, ":(*memory × *address × *wordn) → *memory") - store a word
to memory at a particular address.

• (`trans`, ":*memory → *memory") - transform memory.
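
The functional style of the memory operations can be sketched in Python as follows. Memory is modelled as a dictionary that is never mutated in place, so `store` returns a new memory just as its type suggests. Reading 0 from an unwritten address and using the identity function for `trans` (a memory that no other device touches) are assumptions of this sketch, not properties stated in the report.

```python
def fetch(mem, addr):
    """Retrieve the word at addr; unwritten addresses read as 0 (assumed)."""
    return mem.get(addr, 0)

def store(mem, addr, word):
    """Return a new memory equal to mem except that addr holds word."""
    new_mem = dict(mem)
    new_mem[addr] = word
    return new_mem

def trans(mem):
    """Changes made by other devices; identity is the simplest instantiation."""
    return mem

m0 = {}
m1 = store(m0, 4, 99)
```

Because `store` builds a new value, the old memory `m0` is still available unchanged, which is what allows the specification to relate memory at time t to memory at time t+1.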


Interrupt instructions. The interrupt vector is another example of shared state. 
We will use the following functions to interact with the interrupt vector. 

• (`int_fetch`, ":*wordn → *wordn") - fetch the interrupt vector.

• (`int_trans`, ":*wordn → *wordn") - transform the interrupt vector.

5.2.2 Defining the Electronic Block Model.


As we mentioned before, the electronic block model is a structural description and 
is modeled using existentially linked conjunctions of predicates as described in
Section 2.4. We choose blocks, define predicates to specify their behavior, connect 
them together using hidden internal lines, and connect the remaining lines to the 
external buses. 
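
The structural style can be illustrated with a toy example in Python. Two inverters (hypothetical blocks, not part of AVM-1) are connected through a hidden internal line p, and the composite holds of an input and output exactly when some value of p satisfies both block predicates. Signals are finite boolean traces so that the existential quantifier can be checked by brute force; the horizon of 3 time steps is an arbitrary choice.

```python
from itertools import product

HORIZON = 3  # number of time steps checked (arbitrary, for brute force)

def NOT_SPEC(i, o):
    """Behavioral predicate for an inverter: o t = not (i t) at every time t."""
    return all(o[t] == (not i[t]) for t in range(HORIZON))

def BUFFER_IMPL(i, o):
    """exists p. NOT_SPEC i p and NOT_SPEC p o, with p a hidden internal line."""
    return any(NOT_SPEC(i, p) and NOT_SPEC(p, o)
               for p in product([False, True], repeat=HORIZON))

TRACES = list(product([False, True], repeat=HORIZON))
```

Checking every pair of traces shows that the composite behaves as a buffer, which is the finite analogue of proving that the implementation implies a behavioral specification.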

We have some leeway in choosing the blocks. Each block will be specified using 
a behavioral description. We will not continue the proof below the electronic block 
model level in this dissertation; to completely verify the circuit making up the CPU 
to the gate-level, we would have to specify implementations for each of the blocks 
and prove that the implementation implies the behavioral specification. These 
proofs could be used along with the proof we give here to prove a correctness 
statement showing that the gate-level circuit implies the macro-level interpreter 
specification. 

The level to which the proof should be performed is a subjective consideration. 
We could carry the proof to the transistor level, but there is a point where the 
benefits of the proof are outweighed by its difficulty. For example, we could expend 
effort showing that all the gates we use are correctly implemented in some transistor
model, but such effort would probably be wasted since a verification is only as good 
as the model used in the specification. Given the current state-of-the-art in the 
mathematical modeling of transistors, it is probably more reasonable to assume 
that an AND gate is correctly implemented than it is to assume we have a good 
transistor model. 


The Datapath Blocks. Some of the blocks in the datapath are fairly small (a 
flip-flop for instance) and others are fairly large (the ALU is specified as a single 
block). Still, we believe that our block model is a good compromise between circuit 
detail and proof effort. Much more detail in the electronic block model would have 
made the verification of the phase-level even more difficult. Much less detail would 
have made the verification of the phase-level trivial. As a general rule of thumb, we 
have tried to keep our blocks simple enough that there would be little doubt that a 
device could be made which satisfies the specification. 

Simple Blocks. We begin by defining some simple blocks.

⊢def GND out = (out = F)

⊢def MUX_SPEC ctl a b c = (c = (ctl → a | b))

⊢def MUX_1_SPEC ctl a b c = (c = (ctl → a | b))

⊢def LATCH_SPEC i ld out =
    ∀ t:time . out(t+1) = ld t → i t | out t

⊢def FF_SPEC i ld q =
    ∀ t:num . q(t+1) = ((ld t) → i | (q t))

⊢def REG_SPEC i ld prt out contents =
    ∀ t:time .
      (contents (t+1) = ld t → i t | contents t) ∧
      (prt t ==> (out = contents))

⊢def C255_SPEC rep prt out =
    prt ==> (out = (wordn rep 255))


GND is the ground line. Its output is always false. MUX_SPEC is a simple n-bit, two- 
to-one multiplexor. MUX_1_SPEC is a 1-bit multiplexor; it is identical to MUX_SPEC 
except for its type (which is not shown). 

FF_SPEC and LATCH_SPEC specify a flip-flop and a latch respectively. The only
difference between the two specifications is that FF_SPEC operates on a single bit,
while LATCH_SPEC operates on n-bit words.

REG_SPEC specifies a register. For our purposes, the difference between a register
and a latch is that a register has a tri-stated output port (controlled by the signal 
prt). 

C255_SPEC specifies a hard-wired constant that is tri-stated to the port out.
In this case, the constant is 255. The function wordn is from the abstract word 
package that was just discussed. The abstract functions must be applied to an 
abstract representation, so wordn rep returns a function that coerces an integer 
into an n-bit word. 
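
The block specifications are predicates relating signals, not programs, but they can be exercised on finite traces. The Python sketch below is an illustration under that assumption: signals are lists indexed by time, `latch_spec` checks the latch relation over a given number of steps, and `run_latch` is a hypothetical helper that constructs an output trace satisfying it.

```python
def mux_spec(ctl, a, b, c):
    """The multiplexor relation: c = (ctl -> a | b)."""
    return c == (a if ctl else b)

def latch_spec(i, ld, out, horizon):
    """Check out(t+1) = (ld t -> i t | out t) for t in 0..horizon-1."""
    return all(out[t + 1] == (i[t] if ld[t] else out[t])
               for t in range(horizon))

def run_latch(i, ld, init):
    """Build the unique output trace that starts at init and satisfies the spec."""
    out = [init]
    for t in range(len(i)):
        out.append(i[t] if ld[t] else out[t])
    return out

inputs = [5, 6, 7]
loads = [True, False, True]
outputs = run_latch(inputs, loads, 0)
```

The predicate accepts the simulated trace and rejects a trace that changes while the load line is low, which is the sense in which a device "satisfies the specification".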


The Register Block. The register block is a triple-ported register file with 
32 registers. (The formal specification does not actually say how many registers 
there are until the abstract word package has been instantiated with a concrete 
representation.) The basic operation in the register block is described by the
function UPDATE_REG:

⊢def UPDATE_REG rep psw n reg_list value =
    let sm = (get_sm rep psw) in
    (n = zero_reg) → reg_list |
    (IS_SUP_REG n ∧ ¬sm) → reg_list |
    (SET_EL n reg_list value)


This function is used to update the list representing the register file when certain
conditions have been met. Register 0 has a constant value of 0 which cannot be
changed. Registers 1 through 7 are reserved for privileged mode; they cannot be
changed unless the supervisory mode bit of the program status word is set. The
function SET_EL changes the value of the nth element of a list.
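
The protection rules just described can be sketched in Python. For brevity this sketch takes the supervisory mode bit directly as the argument `sm` instead of extracting it from the PSW, and it fixes the privileged range to registers 1 through 7 as stated in the text; the lower-case names are ours.

```python
ZERO_REG = 0

def is_sup_reg(n):
    """Registers 1 through 7 are reserved for privileged mode."""
    return 1 <= n <= 7

def update_reg(sm, n, reg_list, value):
    """Return the register list after an attempted write of value to register n."""
    if n == ZERO_REG:
        return reg_list                  # register 0 is constant
    if is_sup_reg(n) and not sm:
        return reg_list                  # privileged register, user mode
    # Counterpart of SET_EL: replace element n, leaving the input list intact.
    return reg_list[:n] + [value] + reg_list[n + 1:]

regs = [0] * 32
```

A write that is refused simply returns the old register list, so illegal writes are silently ignored rather than signalled.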

In general, the register file reads values on the in port and writes values to the
outA and outB ports. In addition to the three ports, there are two lines that control
loading the register from the input port, four lines controlling the two output ports,
and three switch lines of type :*reg_len that select registers in the block for various
reasons. One distinguished register in the register file, called ssp_reg, is used as
the stack pointer when the CPU is in supervisory mode.


⊢def REGISTER_BLOCK rep c a b ld ld_ssp prt_A prt_D ssp prt_B
        in outA outB psw reg_list =
    ∀ t:time .
      (reg_list (t+1) =
        (ld t) →
          (UPDATE_REG rep (psw t) (reg_len rep (c t))
                      (reg_list t) (in t)) |
        (ld_ssp t) →
          (UPDATE_REG rep (psw t) ssp_reg
                      (reg_list t) (in t)) |
        (reg_list t)) ∧
      (prt_A t ==> (outA t = (EL (reg_len rep (a t)) (reg_list t)))) ∧
      (prt_D t ==> (outA t = (EL (reg_len rep (c t)) (reg_list t)))) ∧
      (ssp t ==> (outA t = (SSP_REG (reg_list t)))) ∧
      (prt_B t ==> (outB t = (EL (reg_len rep (b t)) (reg_list t))))


The register file is designated reg_list in the specification and is represented as a 
list. The list indexing function EL is used to select specific registers. The register 
block operates as follows: 

• When ld is high, the register selected by the c line is updated with the value
on the input port in.

• When ld_ssp is high, ssp_reg is updated with the value on the input port in.

• When prt_A is high, outA has the value of the register selected by the value
on the a line.

• When prt_D is high, outA has the value of the register selected by the value
on the c line.

• When ssp is high, outA has the value of ssp_reg.

• When prt_B is high, outB has the value of the register selected by the value
on the b line.


The Instruction Register. The instruction register is similar to the register
defined in REG_SPEC, but it has four additional ports that supply the opcode,
destination, A source, and B source fields from the register.


⊢def IR_SPEC rep set prt in out contents
        opc_port dest_port srca_port srcb_port =
    ∀ t:time .
      (contents (t+1) = (set t) → in t | contents t) ∧
      (opc_port t = opcode rep (contents t)) ∧
      (dest_port t = dest rep (contents t)) ∧
      (srca_port t = srca rep (contents t)) ∧
      (srcb_port t = srcb rep (contents t)) ∧
      (prt t ==> (out t = (imm rep (contents t))))


The value on the output port (when the port line is high) is the immediate field, 
not the entire instruction. There is no way to read the complete contents of the 
instruction register onto the bus. 


The PSW Register. The register that holds the program status word (PSW) 
is the most complicated register specification. Each of the 6 bits used for the CPU 
status are individually addressable for the input and output, much as if they were 
6 independent flipflops. The unit functions as a register as well, with input and 
output ports for reading and writing the entire program status word at
once.

⊢def PSW_SPEC rep set clk prt in out ie sm contents
        vf nf cf zf
        s_sm c_sm s_ie c_ie ld_v ld_n ld_c ld_z =
    ∀ t:time .
      (contents (t+1) =
        ((set t) ∧ (get_sm rep (contents t))) →
          (in t) |
        (clk t) →
          (mk_psw rep (
            (s_sm t → T |
             c_sm t → F | (get_sm rep (contents t))),
            (s_ie t → T |
             c_ie t → F | (get_ie rep (contents t))),
            (ld_v t → vf | (get_vf rep (contents t))),
            (ld_n t → nf | (get_nf rep (contents t))),
            (ld_c t → cf | (get_cf rep (contents t))),
            (ld_z t → zf | (get_zf rep (contents t))))) |
        (contents t)) ∧
      (sm t = get_sm rep (contents t)) ∧
      (ie t = get_ie rep (contents t)) ∧
      (prt t ==> (out = contents))


The PSW register operates as follows: 

• When the set line is high and the supervisory mode bit is set, then the current
contents are replaced by the value on the in port.

• When the clk line is high, the new value of the PSW is constructed from the
input ports for the individual fields, provided that their associated load lines
are high.

• The sm port gets the current value of the supervisory mode bit. 

• The ie port gets the current value of the interrupt enable mode bit. 

• When the prt line is high, the output port holds the current contents of the 
register. 
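
As an illustration only, the next-state rule for the PSW can be sketched in Python rather than in the HOL used in this report. The names PSW and psw_next are hypothetical, and modeling each field as a Python boolean is an assumption; the field order follows the tuple passed to mk_psw in PSW_SPEC.

```python
from typing import NamedTuple


class PSW(NamedTuple):
    """Hypothetical concrete PSW; field order follows mk_psw in PSW_SPEC."""
    sm: bool  # supervisory mode
    ie: bool  # interrupt enable
    vf: bool  # overflow flag
    nf: bool  # negative flag
    cf: bool  # carry flag
    zf: bool  # zero flag


def psw_next(cur, set_, clk, in_, vf, nf, cf, zf,
             s_sm=False, c_sm=False, s_ie=False, c_ie=False,
             ld_v=False, ld_n=False, ld_c=False, ld_z=False):
    """Contents of the PSW at time t+1, given contents and inputs at time t."""
    # A whole-word write is honored only in supervisory mode.
    if set_ and cur.sm:
        return in_
    # On the clock, each field is updated only if its control line is high.
    if clk:
        return PSW(
            sm=True if s_sm else (False if c_sm else cur.sm),
            ie=True if s_ie else (False if c_ie else cur.ie),
            vf=vf if ld_v else cur.vf,
            nf=nf if ld_n else cur.nf,
            cf=cf if ld_c else cur.cf,
            zf=zf if ld_z else cur.zf,
        )
    # Otherwise the register holds its value.
    return cur
```

Note that, as in the specification, the set line takes precedence over the clock, and the set line for a mode bit takes precedence over the corresponding clear line.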


The Jump Circuitry. As mentioned in Section 5.1, the jump instruction 
in AVM-1 uses the 4 least significant bits of the destination field to select a jump 
condition. Calculating jump codes could be done in the microcode, but would be 
extremely slow. The electronic block model contains a special block for calculating 
jump codes based on the current PSW and the destination field of the instruction.


⊢def JUMP_SPEC rep d psw out =
  ∀ t:time.
    (out t) = JUMP_COND rep (reg_len rep (d t)) (psw t)






The definition relies on an auxiliary definition, JUMP_COND.



The Memory Buffer Register. The memory buffer register has a complicated 
porting arrangement. The register has one bi-directional port, mem_port, a second 
input port, in, and a second output port, bus. 


⊢def MBR_SPEC set clk rd_s wr_s in value bus mem_port =
  (∀ t:time.
    ((value (t+1) = (((clk t) ∧ (rd_s t)) → mem_port t |
                     ((clk t) ∧ (set t)) → in t | value t)) ∧
     (wr_s t ==> (mem_port = value)))) ∧
  (bus = value)


The specification describes three different parts of the register: 

1. The new value of the register is the value on the memory port if the clock, clk,
and the read line, rd_s, are high. Otherwise, if the clock and the set line are
high, the new value is the value of the input port. If neither of these conditions
is true, then the value of the register is unchanged.

2. The memory port carries the value of the register when the write line, wr_s,
is high.

3. The value on the output bus is always the value of the register. 
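
The priority among these cases can be sketched as a small Python function. This is an informal illustration, not part of the HOL specification; the name mbr_next and the use of plain Python values for words are assumptions.

```python
def mbr_next(value, clk, rd_s, set_, mem_port, in_):
    """Value of the memory buffer register at time t+1."""
    # A clocked read from memory has priority over a clocked set.
    if clk and rd_s:
        return mem_port
    if clk and set_:
        return in_
    # With neither condition true, the register holds its value.
    return value
```

For example, when both rd_s and set are high on the clock, the register takes the value on the memory port, because the read case is tested first in the conditional.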


The Interrupt Vector Register. IVEC_SPEC describes the interrupt vector
register. This register does not actually reside on the CPU, but is shared with the 
interrupt controller. Thus, the following specification can be thought of as a partial 
specification for the interrupt controller; the only part specified is the part that the 
CPU actually uses to read the interrupt vector. 


⊢def IVEC_SPEC rep prt out contents =
  ∀ t:time.
    (contents (t+1) = (contents t)) ∧
    (prt t ==> (out t = (int_fetch rep (contents t))))


The Demultiplexors. The specification of the electronic block model makes
use of several demultiplexors. The following specification describes a 2-to-4
demultiplexor. The specification of a 3-to-8 demultiplexor is similar.


⊢def DEMUX_2_SPEC s o0 o1 o2 o3 =
  (∀ t. o0 t = ((s t) = (F,F))) ∧
  (∀ t. o1 t = ((s t) = (F,T))) ∧
  (∀ t. o2 t = ((s t) = (T,F))) ∧
  (∀ t. o3 t = ((s t) = (T,T)))
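
Read as a function at a single instant, the demultiplexor is a one-hot decoder. The Python sketch below is illustrative only; the name demux2 is hypothetical, and select values are assumed to be pairs of Python booleans.

```python
def demux2(s):
    """Return (o0, o1, o2, o3) for a 2-bit select s, exactly one of which is True."""
    codes = ((False, False), (False, True), (True, False), (True, True))
    return tuple(s == code for code in codes)
```

A 3-to-8 version would differ only in enumerating eight boolean triples.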


The Memory Block. The operation of the memory block is based on the 
operation of two abstract functions: store and fetch. Memory has a single
bi-directional data port, data, an address port, a read signal, rd_s, and a write
signal, wr_s.


⊢def MEM rep wr_s rd_s addr data mem =
  ∀ t:time.
    (mem (t+1) =
      (wr_s t → store rep (mem t, address rep (addr t), (data t))
              | mem t)) ∧
    (rd_s t ==>
      (data t = (fetch rep (mem t, address rep (addr t)))))


When the write signal is high, the new value of memory is the result returned 
by applying store to the old memory, the address, and the data. Otherwise, the 
value of memory is unchanged. If the read signal is true, then the value of memory 
returned by the fetch function is placed on the data port. 
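
One way to picture this behavior is with a Python dictionary standing in for the abstract memory. This is only a sketch: store and fetch are left abstract in the report, so the dictionary representation, the name mem_step, and the default of 0 for unwritten locations are all assumptions.

```python
def mem_step(mem, wr_s, rd_s, addr, data):
    """Return (memory at t+1, value driven on the data port at t or None)."""
    # store: a new memory that differs from the old one only at addr.
    next_mem = {**mem, addr: data} if wr_s else mem
    # fetch: reads see the memory at time t, not the updated memory.
    driven = mem.get(addr, 0) if rd_s else None
    return next_mem, driven
```

The old memory is never mutated, which mirrors the functional style of store in the specification.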

We specify the memory as one of the blocks in the electronic block model. There 
are other ways of specifying memory. For example, Joyce [Joy89a] separates the 
memory block from the electronic block model and then uses the combination to 
verify the macro-level. There is, however, little difference in meaning. 








⊢def AVM_ALU_SPEC rep switch (in_A,in_B,cin) out (neg,zero,ovfl,carry) =
  ((switch = F,F,F,F) →
    ADD_WITHOUT_CARRY rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = F,F,F,T) →
    ADD_WITH_CARRY rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = F,F,T,F) →
    INCREMENT rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = F,F,T,T) →
    SUB_WITHOUT_CARRY rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = F,T,F,F) →
    SUB_WITH_CARRY rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = F,T,F,T) →
    DECREMENT rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = F,T,T,F) →
    BITWISE_AND rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = F,T,T,T) →
    BITWISE_XOR rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = T,F,F,F) →
    BITWISE_OR rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = T,F,F,T) →
    BITWISE_NOT rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = T,F,T,F) →
    ALU_NOOP rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = T,F,T,T) →
    ALU_NOOP rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = T,T,F,F) →
    ALU_NOOP rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = T,T,F,T) →
    ALU_NOOP rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
  ((switch = T,T,T,F) →
    ALU_NOOP rep (in_A,in_B,cin) out (neg,zero,ovfl,carry) |
    ALU_NOOP rep (in_A,in_B,cin) out (neg,zero,ovfl,carry)
  )))))))))))))))


Figure 5.6: The ALU Specification for AVM-1. 


The ALU Block. The ALU definition used in the specification of the electronic
block model is shown in Figure 5.6. The ALU selects one of 16 functions based on 
the value of a 4-bit input, switch. Only 10 of the 16 available functions are used in 
our implementation. The complete specification gives the formal definition of each 
of the functions, including how flags are set for each operation. The ALU performs 
addition, with and without carry; incrementing; subtraction, with and without 
carry; decrementing; and bitwise disjunction, conjunction, exclusive disjunction, 
and negation. The 16 functions are filled out with a NOOP operation that passes the 
A input through unchanged, but sets the appropriate flags. The functions operate 
on the A and B inputs (in_A and in_B respectively) and produce the output, out. In 
addition, there is a carry in, cin, and four result flags indicating a negative result, 
a zero result, overflow and carry. 

The auxiliary functions used to define the ALU are defined in terms of the abstract 
word package. For example, here is the auxiliary function used to define addition 
without carry: 


⊢def ADD_WITHOUT_CARRY rep (in_A,in_B,cin) out
                       (neg,zero,ovfl,carry) =
  let result = (add rep) (in_A,in_B) in
  let c = (addp rep) (in_A,in_B,result) and
      n = (negp rep) result and
      z = (zerop rep) result and
      v = (aovfl rep) (in_A,in_B,result) in
  ((out = result) ∧
   (neg = n) ∧
   (zero = z) ∧
   (ovfl = v) ∧
   (carry = c))


This predicate specifies addition without carry simply because it uses the auxiliary 
functions that we have decided describe that operation. In fact, this specification 
makes no statement about what addition without carry means. Furthermore, we 
will not prove that the ALU adds correctly or performs any other mathematical 
operation. What we will prove is that the primitive operations are called in such a 
way that the top level specification is met. 
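
To make the shape of this predicate concrete, the following Python sketch shows one possible instantiation of the abstract operations. It is not the instantiation used in this report, which deliberately leaves add, addp, negp, zerop, and aovfl uninterpreted; the 8-bit two's-complement word, the name add_without_carry, and the flag definitions are all assumptions made for illustration.

```python
WIDTH = 8                      # assumed word size
MASK = (1 << WIDTH) - 1


def add_without_carry(in_a, in_b, cin=False):
    """Return (out, (neg, zero, ovfl, carry)) for unsigned inputs in 0..MASK."""
    total = in_a + in_b        # cin is accepted but ignored, as in the specification
    result = total & MASK
    sign = lambda w: (w >> (WIDTH - 1)) & 1
    carry = total > MASK                                   # plays the role of addp
    neg = sign(result) == 1                                # plays the role of negp
    zero = result == 0                                     # plays the role of zerop
    ovfl = sign(in_a) == sign(in_b) != sign(result)        # plays the role of aovfl
    return result, (neg, zero, ovfl, carry)
```

Under these assumptions, adding 0x7F and 0x01 yields 0x80 with the negative and overflow flags set, while adding 0xFF and 0x01 yields 0 with the zero and carry flags set.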


The Shifter Block. The shifter has four functions: 1-bit shift left, 1-bit shift 
right, 1-bit arithmetic shift right, and a NOOP. The functions are selected by a 
2-bit switch. 


⊢def SHIFTER_SPEC rep switch in_A out c_flag =
  ((switch = (F,F)) → ((out = (shl rep) in_A) ∧
                       (c_flag = (msb rep) in_A)) |
   (switch = (F,T)) → ((out = (shr rep) in_A) ∧
                       (c_flag = (lsb rep) in_A)) |
   (switch = (T,F)) → ((out = (asr rep) in_A) ∧
                       (c_flag = (lsb rep) in_A)) |
                      ((out = in_A) ∧
                       (c_flag = F)))


The specification of the shifter is also given in terms of the abstract representation 
for the microprocessor. In addition to calculating the output of the shifter, the 
specification produces a carry corresponding to the bit shifted out of the word. 
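
The four shifter functions can be sketched in Python as follows. As with the ALU, the report leaves shl, shr, asr, msb, and lsb abstract; the 8-bit unsigned word and the name shifter below are assumptions chosen only to make the example runnable.

```python
WIDTH = 8                      # assumed word size
MASK = (1 << WIDTH) - 1
TOP = 1 << (WIDTH - 1)


def shifter(switch, in_a):
    """Return (out, c_flag) for a 2-bit switch and an unsigned word in 0..MASK."""
    msb = bool(in_a & TOP)
    lsb = bool(in_a & 1)
    if switch == (False, False):          # 1-bit shift left; carry is the old msb
        return (in_a << 1) & MASK, msb
    if switch == (False, True):           # 1-bit logical shift right; carry is the old lsb
        return in_a >> 1, lsb
    if switch == (True, False):           # 1-bit arithmetic shift right keeps the sign bit
        return (in_a >> 1) | (in_a & TOP), lsb
    return in_a, False                    # NOOP passes the input through
```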


Miscellaneous Logic. In addition to several and-gates and or-gates, the spec- 
ification makes use of several larger chunks of logic to describe the selection signals 
for the memory address register and the program counter register. 


⊢def MAR_LOGIC_SPEC pmux clk_3 clk_4 mar out =
  ∀ t:time. (out t) =
    ((((pmux t) ∧ (clk_3 t)) ∨
      (¬(pmux t) ∧ (clk_4 t))) ∧ (mar t))

⊢def PC_LOGIC_SPEC clk pc_enable pc_jmp_enable jump_flag out =
  ∀ t:time. (out t) = (clk t) ∧
    ((pc_enable t) ∨
     ((pc_jmp_enable t) ∧ (jump_flag t)))
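
At a single instant, both pieces of logic are simple boolean functions. The Python sketch below (function names hypothetical) restates them: the memory address register is loaded in phase 3 when pmux is high and in phase 4 when pmux is low, and the program counter is loaded either unconditionally or when a conditional jump is taken.

```python
def mar_logic(pmux, clk_3, clk_4, mar):
    """Load signal for the memory address register."""
    return ((pmux and clk_3) or ((not pmux) and clk_4)) and mar


def pc_logic(clk, pc_enable, pc_jmp_enable, jump_flag):
    """Load signal for the program counter."""
    return clk and (pc_enable or (pc_jmp_enable and jump_flag))
```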


The Datapath Specification. The datapath definition (shown in Figure 5.7) is 
made from the blocks that we have specified. We specify the internal lines using 
existential quantification. The specification of the datapath is difficult to read; it 
is also difficult to write. There is, however, a close correspondence between the 
major blocks, the internal lines, and the external lines in the specification given in 
Figure 5.7 and the circuit diagram shown in Figure 5.2. In a production setting, 
the structural specification could be derived from a CAD description of the circuit, 
given the appropriate definitions for the blocks. This would make the specification 
of the electronic block model much easier. 

Some of the blocks in the datapath expect arguments that are functions of time 
and others do not. The use of the blocks in the specification of the datapath reflects 
this. For example, MUX_SPEC uses (MuxIn t) as an argument, while MBR_SPEC simply
uses MuxIn.


⊢def DATAPATH rep mem reg mar mbr alatch blatch ir pc psw ivec
              iack_ff ireq_ff ireq_e amux_s alu_s shft_s mbr_s
              mar_s pmux_s cselect aselect bselect
              s_sm c_sm s_ie c_ie ld_c ld_v ld_n ld_z csrc_s
              iack_s rd_s wr_s opc ie sm clk_1 clk_2 clk_3 clk_4 =
  ∀ t:time.
    ∃ Abus Bbus Cbus MuxOut MuxIn MemData AluOut Gnd MarIn
      regd_enable ssp_enable psw_enable ir_enable pc_enable
      pc_jmp_enable reg_a_enable reg_sa_enable ssp_a_enable
      psw_a_enable C255_enable pc_a_enable reg_b_enable
      ivec_enable ir_b_enable ld_reg_block ld_ssp ld_ir
      ld_psw ld_mar ld_pc do_write dest_s srca_s srcb_s
      alu_c shift_c cf nf vf zf jump_flag pc_a_1 pc_a_2
      pc_a_3 ir_b_1 ir_b_2 float0 float1 .
    (GND (Gnd t)) ∧
    (DEMUX_3_SPEC cselect regd_enable ssp_enable psw_enable
                  ir_enable pc_enable pc_jmp_enable
                  float0 float1) ∧
    (DEMUX_3_SPEC aselect reg_a_enable reg_sa_enable ssp_a_enable
                  psw_a_enable C255_enable pc_a_1
                  pc_a_2 pc_a_3) ∧
    (OR_3_SPEC pc_a_1 pc_a_2 pc_a_3 pc_a_enable) ∧
    (DEMUX_2_SPEC bselect reg_b_enable
                  ivec_enable ir_b_1 ir_b_2) ∧
    (OR_SPEC ir_b_1 ir_b_2 ir_b_enable) ∧
    (AND_SPEC clk_4 regd_enable ld_reg_block) ∧
    (AND_SPEC clk_4 ssp_enable ld_ssp) ∧
    (REGISTER_BLOCK rep dest_s srca_s srcb_s
                    ld_reg_block ld_ssp reg_a_enable
                    reg_sa_enable ssp_a_enable reg_b_enable
                    Cbus Abus Bbus psw reg) ∧
    (AND_SPEC clk_4 ir_enable ld_ir) ∧
    (IR_SPEC rep ld_ir ir_b_enable Cbus Bbus ir
             opc dest_s srca_s srcb_s) ∧

Figure 5.7: The specification for the datapath (continued on next page). 


    ... ∧
    (LATCH_SPEC Abus clk_2 alatch) ∧
    (LATCH_SPEC Bbus clk_2 blatch) ∧
    (IVEC_SPEC rep ivec_enable Bbus ivec) ∧
    (FF_SPEC iack_s clk_2 iack_ff) ∧
    (FF_SPEC ireq_e clk_1 ireq_ff) ∧
    (MUX_SPEC (amux_s t) (MuxIn t) (alatch t) (MuxOut t)) ∧
    (MAC2_ALU_SPEC rep
                   (alu_s t)
                   (MuxOut t, blatch t, get_cf rep (psw t))
                   (AluOut t) (nf t, zf t, vf t, alu_c t)) ∧
    (SHIFTER_SPEC rep (shft_s t) (AluOut t) (Cbus t)
                  (shift_c t)) ∧
    (MUX_1_SPEC (csrc_s t) (alu_c t) (shift_c t) (cf t)) ∧
    (MBR_SPEC mbr_s clk_4 rd_s wr_s Cbus mbr MuxIn MemData) ∧
    (AND_SPEC clk_4 psw_enable ld_psw) ∧
    (PSW_SPEC rep ld_psw clk_4 psw_a_enable Cbus Abus ie sm psw
              (vf t) (nf t) (cf t) (zf t)
              s_sm c_sm s_ie c_ie ld_v ld_n ld_c ld_z) ∧
    (JUMP_SPEC rep dest_s psw jump_flag) ∧
    (PC_LOGIC_SPEC clk_4 pc_enable pc_jmp_enable
                   jump_flag ld_pc) ∧
    (REG_SPEC Cbus ld_pc pc_a_enable Abus pc) ∧
    (C255_SPEC rep (C255_enable t) (Abus t)) ∧
    (MUX_SPEC (pmux_s t) (pc t) (Cbus t) (MarIn t)) ∧
    (MAR_LOGIC_SPEC pmux_s clk_3 clk_4 mar_s ld_mar) ∧
    (LATCH_SPEC MarIn ld_mar mar) ∧
    (AND_SPEC clk_4 wr_s do_write) ∧
    (MEM rep do_write rd_s mar MemData mem)


Figure 5.8: The specification for the datapath (continued). 





The Control Unit Blocks. Now that we have specified the datapath, we will 
turn our attention to the control unit. The control unit has three main parts: (1) the 
microprogram counter and its associated logic, (2) the microinstruction register, and 
(3) the clock. The microrom is specified as a variable. The microrom specification
for the microcode in AVM-1 is described in Section 5.2.4. 


The Microprogram Unit Block. The microprogram unit calculates the next 
value of the microprogram counter from the current value, the contents of the 
microinstruction register, and some signals from the datapath. The microprogram 
counter is 6 bits wide; the function add_bt6 adds a boolean 6-tuple and a number. 


⊢def MPC_UNIT mpc opc addr cond ireq_f ie sm =
  let bt6_inc n = (add_bt6 n 1) in
  ((cond = (F,F,F)) → (bt6_inc mpc) |
   (cond = (F,F,T)) → addr |
   (cond = (F,T,F)) → (add_bt6 (F,(SND opc)) 4) |
   (cond = (F,T,T)) →
     ((ireq_f ∧ ie) → addr |
                      (bt6_inc mpc)) |
   (cond = (T,F,F)) → (sm → addr | (bt6_inc mpc)) |
   (bt6_inc mpc))


There are 5 jump conditions: 

1. No jump; the microprogram counter is incremented. This is the default oper- 
ation. 

2. Jump to addr unconditionally.

3. Jump to the location given by the opc signal plus an offset (4 in this case). 
This allows us to use a table lookup approach to instruction decoding in the 
microcode. Note that no matter what the opcode is, we only use the 5 least 
significant bits for a value. The top half of the instruction set is reserved for 
a coprocessor. 

4. Jump to addr if the interrupt signal is true and interrupts are enabled.

5. Jump to addr if the supervisory mode signal is true. 
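
The five jump conditions can be sketched in Python as shown below. This is an informal model rather than the HOL definition: the name mpc_unit is hypothetical, 6-bit values are represented as integers in 0..63 instead of boolean 6-tuples, and wrap-around modulo 64 is an assumed reading of add_bt6.

```python
F, T = False, True


def mpc_unit(mpc, opc, addr, cond, ireq_f, ie, sm):
    """Next microprogram counter value; mpc, opc, addr are ints in 0..63."""
    inc = (mpc + 1) % 64
    if cond == (F, F, F):                 # 1. no jump
        return inc
    if cond == (F, F, T):                 # 2. unconditional jump
        return addr
    if cond == (F, T, F):                 # 3. opcode dispatch: low 5 bits plus offset 4
        return ((opc & 0b011111) + 4) % 64
    if cond == (F, T, T):                 # 4. jump on an enabled interrupt request
        return addr if (ireq_f and ie) else inc
    if cond == (T, F, F):                 # 5. jump in supervisory mode
        return addr if sm else inc
    return inc                            # unused condition codes default to increment
```

Masking the opcode with 0b011111 corresponds to replacing the most significant bit with F in the term (F,(SND opc)).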

We use the above definition in specifying the microprogram counter: 


⊢def MPC_SPEC rep mpc clk opc irq ie sm addr_s cond_s =
  ∀ t:time.
    mpc (t+1) =
      ((clk t) →
        (MPC_UNIT (mpc t) (opc t) (addr_s t)
                  (cond_s t) (irq t) (ie t) (sm t)) |
        mpc t)


When the clk signal is high, the new value of the microprogram counter is calculated
using MPC_UNIT. Otherwise, the value remains unchanged.


The Microinstruction Register Block. The microinstruction register is sim- 
ple in concept, but rather unwieldy to specify. The specification describes a register 
with 25 ports — one corresponding to each of the fields in the microinstruction. The 
following specification uses selection functions on microinstructions to produce the 
various fields. For example, Alu is a selector on a microinstruction that returns a 
4-bit field giving the ALU operation. 


⊢def MIR_SPEC mir clk in
              amux_s sh_s alu_s mbr_s mar_s pmux_s cselect
              aselect bselect s_sm_s c_sm_s s_ie_s c_ie_s
              ld_c_s ld_v_s ld_n_s ld_z_s csrc_s ftch_s
              iack_s rd_s wr_s addr_s cond_s =
  ∀ t:time.
    (mir (t+1) = (clk t → (in t) | (mir t))) ∧
    (amux_s t = (Amux (mir t))) ∧
    (sh_s t = (Shift (mir t))) ∧
    (alu_s t = (Alu (mir t))) ∧
    (mbr_s t = (Mbr (mir t))) ∧
    (mar_s t = (Mar (mir t))) ∧
    (pmux_s t = (Pmux (mir t))) ∧
    (cselect t = (Trgt (mir t))) ∧
    (aselect t = (SrcA (mir t))) ∧
    (bselect t = (SrcB (mir t))) ∧
    (s_sm_s t = (S_sm (mir t))) ∧
    (c_sm_s t = (C_sm (mir t))) ∧
    (s_ie_s t = (S_ie (mir t))) ∧
    (c_ie_s t = (C_ie (mir t))) ∧
    (ld_c_s t = (Ld_c (mir t))) ∧
    (ld_v_s t = (Ld_v (mir t))) ∧
    (ld_n_s t = (Ld_n (mir t))) ∧
    (ld_z_s t = (Ld_z (mir t))) ∧
    (csrc_s t = (Csrc (mir t))) ∧
    (ftch_s t = (Ftch (mir t))) ∧
    (iack_s t = (Iack (mir t))) ∧
    (rd_s t = (Rd (mir t))) ∧
    (wr_s t = (Wr (mir t))) ∧
    (addr_s t = (Address (mir t))) ∧
    (cond_s t = (Cond (mir t)))


The Clock Block. The clock is a four-valued counter with a strobe line for 
each of the phases. The counter sequences from 0 to 3. The strobe clk_l is only 
high in the first clock phase, the strobe clk_2 is only high in the second clock phase, 
and so on. 

⊢def CLOCK_SPEC clk clk_1 clk_2 clk_3 clk_4 =
  ∀ t:time.
    (clk (t+1) = (((clk t) = (F,F)) → (F,T) |
                  ((clk t) = (F,T)) → (T,F) |
                  ((clk t) = (T,F)) → (T,T) | (F,F))) ∧
    (clk_1 t = (clk t = (F,F))) ∧
    (clk_2 t = (clk t = (F,T))) ∧
    (clk_3 t = (clk t = (T,F))) ∧
    (clk_4 t = (clk t = (T,T)))
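
The clock can be pictured as a 2-bit counter with a one-hot strobe decoder. The Python sketch below is illustrative only; the names PHASES, clock_next, and strobes are hypothetical.

```python
PHASES = ((False, False), (False, True), (True, False), (True, True))


def clock_next(clk):
    """Counter value at t+1: the four phases in order, then back to the first."""
    return PHASES[(PHASES.index(clk) + 1) % 4]


def strobes(clk):
    """Return (clk_1, clk_2, clk_3, clk_4); exactly one strobe is high."""
    return tuple(clk == phase for phase in PHASES)
```

Stepping the counter four times from any phase returns to that phase, and each intermediate value raises a different strobe.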


The Control Unit. The control unit is specified by connecting the behavioral
specifications for the microprogram counter, microinstruction register, and clock. 
The only internal lines carry the address and jump condition portions of the mi- 
croinstruction from the microinstruction register to the microprogram counter unit. 


⊢def CONTROL_UNIT rep
       mpc mir clk urom
       clk_1 clk_2 clk_3 clk_4
       amux_s sh_s alu_s mbr_s mar_s pmux_s cselect aselect
       bselect s_sm c_sm s_ie c_ie ld_c ld_v ld_n ld_z csrc_s
       ftch_s iack_s rd_s wr_s opc sm ie ireq_f =
  ∃ addr_s cond_s .
    (MPC_SPEC rep mpc clk_4 opc ireq_f ie sm addr_s cond_s) ∧
    (MIR_SPEC mir clk_1 (λ t. (urom t (bt6_val (mpc t))))
              amux_s sh_s alu_s mbr_s mar_s pmux_s cselect
              aselect bselect s_sm c_sm s_ie c_ie ld_c ld_v
              ld_n ld_z csrc_s ftch_s
              iack_s rd_s wr_s addr_s cond_s) ∧
    (CLOCK_SPEC clk clk_1 clk_2 clk_3 clk_4)


EBM State. Before we put the datapath and the control unit together to specify 
the structure of the electronic block model, we describe the state that is visible at 
this level. The following state-tuple is used to describe the state at the electronic 
block model level. 

(reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
alatch, blatch, ireq_ff, iack_ff, mir, urom, clk)

The state-tuples for more abstract levels will contain a subset of the members of 
the state-tuple at this level. We have kept the names consistent between levels for 
clarity. 

• reg - A variable of type :(*wordn)list used to represent the register file.

• psw - A variable of type :*wordn used to represent the program status word.

• pc - A variable of type :*wordn used to represent the program counter.

• mem - A variable of type :*memory used to represent external memory.

• ivec - A variable of type :*wordn used to represent the interrupt vector.

• ir - A variable of type :*wordn used to represent the instruction register.

• mar - A variable of type :*wordn used to represent the memory address
register.

• mbr - A variable of type :*wordn used to represent the memory buffer
register.

• mpc - A variable of type :bt6 (boolean 6-tuple) used to represent the
microprogram counter.

• alatch - A variable of type :*wordn used to represent the latch feeding the
A side of the ALU.

• blatch - A variable of type :*wordn used to represent the latch feeding the
B side of the ALU.

• ireq_ff - A variable of type :bool used to represent the interrupt request
flipflop.

• iack_ff - A variable of type :bool used to represent the interrupt acknowledge
flipflop.

• mir - A variable of type :ucode (a complex bit-string) used to represent the
microinstruction register.

• urom - A variable of type :num → ucode used to represent the microrom.

• clk - A variable of type :bt2 (boolean 2-tuple) used to represent the
phase-level clock.

The EBM Specification. The electronic block model is specified by connecting
the datapath and the control unit using existential quantification to represent
internal lines. We want a definition of the electronic block model that can be used
with the generic interpreter specification. The electronic block model is used to
instantiate the abstract implementation, Impl, which has the abstract type

:(time' → *state') → (time' → *env') → bool

The definition must take two functions of time, one representing the state stream
and the other the environment stream, and return a boolean. We use tuples
abstracted over time to represent these state and environment streams.


⊢ EBM
    rep
    (λ t.
      (reg t, psw t, pc t, mem t, ivec t, ir t, mar t, mbr t, mpc t,
       alatch t, blatch t, ireq_ff t, iack_ff t, mir t, urom, clk t))
    (λ t. (ireq_e t)) =
  (∃ amux_s alu_s shft_s mbr_s mar_s pmux_s cselect aselect
     bselect s_sm c_sm s_ie c_ie ld_c ld_v ld_n ld_z csrc_s
     iack_s rd_s wr_s ftch_s opc ie sm clk_1 clk_2 clk_3 clk_4.
    DATAPATH rep
      mem reg mar mbr alatch blatch ir pc psw
      ivec iack_ff ireq_ff ireq_e amux_s alu_s
      shft_s mbr_s mar_s pmux_s cselect aselect bselect
      s_sm c_sm s_ie c_ie ld_c ld_v ld_n ld_z csrc_s
      iack_s rd_s wr_s opc ie sm
      clk_1 clk_2 clk_3 clk_4 ∧
    CONTROL_UNIT rep
      mpc mir clk (λ t. urom) clk_1 clk_2 clk_3 clk_4
      amux_s shft_s alu_s mbr_s mar_s pmux_s
      cselect aselect bselect s_sm c_sm s_ie c_ie
      ld_c ld_v ld_n ld_z csrc_s ftch_s iack_s
      rd_s wr_s opc sm ie ireq_ff)


The above specification is not a definition, but rather a theorem; a definition cannot 
have lambda abstractions on the left-hand side. To create this theorem, we define 
the electronic block model using single variables for the state and the environment 
and selectors on those variables. Using that definition and the definition of the state 
selectors, we can derive the theorem given above. 


The EBM Clock. There are two other parts of the abstract representation that 
need to be instantiated with definitions related to the specification of the electronic 
block model. We must define a function representing count, which takes the elec- 
tronic block model state and environment streams and returns the clock. We must 
also define a constant begin that designates the beginning state for the electronic 
block model clock. There is one small problem: the electronic block model clock and
the phase-level clock are the same; in other words, there is no temporal abstraction
between those two levels. We can still use the generic interpreter theory, however,
since we can model this using a 1-phase clock at the electronic block model level.

The following definitions for GetEBMClock and EBM_Begin, which represent count 
and begin respectively, implement a 1-phase clock. There are many ways of doing 
this; we chose to use an arbitrary boolean value to represent the single phase. 


⊢def GetEBMClock rep (reg, psw, pc, mem, ivec, ir, mar,
                      mbr, mpc, alatch, blatch, ireq_ff,
                      iack_ff, mir, urom, clk)
                     (int_e) = ε x:bool. F

⊢def EBM_Begin = ε x:bool. F


The expression ε x:bool. F chooses a boolean value for x such that the expression
F is true. Since F can never be true, we get an arbitrary value of the same type as
x, boolean.

5.2.3 Defining the Phase Level. 

The phase-level represents the lowest level interpreter in our hierarchy. It is really a 
reflection of the electronic block model rather than an abstraction. All of the state 
present in the electronic block model is present in the phase-level and they share 
the same clock. Although we only show that the electronic block model implies the 
phase-level, we could show that they are equivalent. We first present an informal 
description of the phase-level interpreter and then present the formal definitions. 


Defining the Phase-Level State. The state-tuple that describes the phase- 
level interpreter state is identical to the tuple describing the state of the electronic 
block model. 

(reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
alatch, blatch, ireq_ff, iack_ff, mir, urom, clk)

The variables have the same meaning as they did in the electronic block model. 


Defining the Instruction List. The operation of the phase-level interpreter is 
fairly simple. We associate each phase in the system clock with an instruction in 
the phase-level interpreter. The instructions define the state transitions that occur 
during each phase of the clock. This same information is available in the electronic 
block model, but is not as apparent. During the four phases, the machine performs 
the following state transitions: 

1. In phase 1, the microinstruction register is loaded from the microrom. 

2. In phase 2, the latches feeding the ALU are loaded from the register file and
system registers.

3. In phase 3, the ALU and shifter calculate a result based on their inputs.

4. In phase 4, the result calculated in phase 3 is stored back into the register file
and system registers.


The formal definitions for these phases describe in detail what happens at each 
phase. 


Phase One. During the first phase, the microinstruction register is loaded with
the contents of the microrom at the location given by the current microprogram 
counter, the flip-flop holding the interrupt request is latched from the interrupt 
request line in the environment, and the clock is updated so that the second phase 
is selected next. 


⊢def phase_one rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
                    alatch, blatch, ireq_ff, iack_ff, mir, urom,
                    clk)
                   (int_e) =
  let new_mir = urom (bt6_val mpc) and
      new_ireq_ff = int_e and
      new_clk = (F,T) in
  (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
   alatch, blatch, new_ireq_ff, iack_ff, new_mir,
   urom, new_clk)


Phase Two. During the second phase, the latches that feed the ALU are
loaded from the register file and system registers according to the SrcA and SrcB
fields in the microinstruction register. In addition, the interrupt acknowledge
flip-flop is set if the interrupt acknowledge field is set in the microinstruction register.
The clock is updated to select the third phase.


⊢def phase_two rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
                    alatch, blatch, ireq_ff, iack_ff, mir, urom,
                    clk)
                   (int_e) =
  let new_alatch = (
    ((SrcA mir) = (F,F,F)) →
      (EL (reg_len rep (srca rep ir)) reg) |
    ((SrcA mir) = (F,F,T)) →
      (EL (reg_len rep (dest rep ir)) reg) |
    ((SrcA mir) = (F,T,F)) → (SSP_REG reg) |
    ((SrcA mir) = (F,T,T)) → psw |
    ((SrcA mir) = (T,F,F)) → (wordn rep 255) |
    pc) in
  let new_blatch = (
    ((SrcB mir) = (F,F)) →
      (EL (reg_len rep (srcb rep ir)) reg) |
    ((SrcB mir) = (F,T)) → (int_fetch rep ivec) |
    (imm rep ir)) in
  let new_iack_ff = Iack mir and
      new_clk = (T,F) in
  (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
   new_alatch, new_blatch, ireq_ff, new_iack_ff,
   mir, urom, new_clk)


Note that setting the interrupt acknowledge flip-flop in this phase is not conditioned
upon the value of the interrupt request flip-flop set in phase one, but upon the current
contents of the microinstruction register. Any connection between the values on
these lines is established in the microcode, not in the hardware.


Phase Three. The primary function of the third phase is to allow the result
from the ALU and shifter to stabilize. In addition, the memory address register is
loaded from the program counter if the Mar and Pmux fields are high in the current
microinstruction. The clock is updated to select phase four.


⊢def phase_three rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
                      alatch, blatch, ireq_ff, iack_ff, mir, urom,
                      clk)
                     (int_e) =
  let new_mar = (((Pmux mir) ∧ (Mar mir)) → pc | mar) and
      new_clk = (T,T) in
  (reg, psw, pc, mem, ivec, ir, new_mar, mbr, mpc,
   alatch, blatch, ireq_ff, iack_ff, mir, urom, new_clk)
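
Because phase three changes only two components of the state, it makes a compact example of how a phase-level instruction maps one state-tuple to the next. The Python sketch below is illustrative only: representing the state and the microinstruction as dictionaries keyed by name, and the function name phase_three itself, are assumptions, not the representation used in the HOL proof.

```python
def phase_three(state, int_e=False):
    """Return the state after phase three; only mar and clk can change."""
    mir = state["mir"]
    # The memory address register is loaded from pc when Pmux and Mar are both high.
    new_mar = state["pc"] if (mir["Pmux"] and mir["Mar"]) else state["mar"]
    # The clock is set to (T,T), which selects phase four next.
    return {**state, "mar": new_mar, "clk": (True, True)}
```

The input state is not modified; a fresh dictionary is returned, in keeping with the functional style of the specification.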


Phase Four. Phase four (shown in Figure 5.9) is the busiest of the four phases.
The program status word is updated, the results from the ALU and shifter are stored
back in the register file and other system registers, the microprogram counter is
updated, the memory buffer register is updated, and a new value of memory is
calculated.


⊢def phase_four rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
                     alatch, blatch, ireq_ff, iack_ff, mir, urom,
                     clk)
                    (int_e) =
  let a_input = ((Amux mir) → mbr | alatch) in
  let carry_in = (get_cf rep psw) in
  let alu_result =
    ALU_FUNC rep (Alu mir) a_input blatch carry_in in
  let cf =
    ALU_CARRY_FUNC rep (Alu mir) a_input blatch carry_in in
  let vf =
    ALU_OVFL_FUNC rep (Alu mir) a_input blatch carry_in in
  let nf =
    ALU_NEG_FUNC rep (Alu mir) a_input blatch carry_in in
  let zf =
    ALU_ZERO_FUNC rep (Alu mir) a_input blatch carry_in in
  let result = SHIFTER_FUNC rep (Shift mir) alu_result in
  let shft_c = SHIFTER_CARRY_FUNC rep (Shift mir) alu_result in
  let opc = (opcode rep ir) in
  let ie = (get_ie rep psw) and
      sm = (get_sm rep psw) in
  let new_psw = (
    (((Trgt mir) = (F,T,F)) ∧ sm) → result |
    (mk_psw rep (
      ((S_sm mir) → T | (C_sm mir) → F | sm),
      ((S_ie mir) → T | (C_ie mir) → F | ie),
      ((Ld_v mir) → vf | (get_vf rep psw)),
      ((Ld_n mir) → nf | (get_nf rep psw)),
      ((Ld_c mir) →
        ((Csrc mir) → cf | shft_c) | (get_cf rep psw)),
      ((Ld_z mir) → zf | (get_zf rep psw))))) in


Figure 5.9: Phase four of the phase-level interpreter (continued on next page).


  let new_reg = (
    ((Trgt mir) = (F,F,F)) →
      (UPDATE_REG rep psw
        (reg_len rep (dest rep ir)) reg result) |
    ((Trgt mir) = (F,F,T)) →
      (UPDATE_REG rep psw ssp_reg reg result) |
    reg) in
  let new_mpc = (
    MPC_UNIT mpc opc (Address mir)
             (Cond mir) ireq_ff ie sm) in
  let new_ir = (((Trgt mir) = (F,T,T)) → result | ir) in
  let jmp = (JUMP_COND rep (reg_len rep (dest rep ir)) psw) in
  let new_pc = (
    ((Trgt mir) = (T,F,F)) → result |
    (((Trgt mir) = (T,F,T)) ∧ jmp) → result | pc) in
  let new_mbr = (
    (Rd mir) → (fetch rep (mem, address rep mar)) |
    (Mbr mir) → result |
    mbr) in
  let new_mar =
    ((¬(Pmux mir) ∧ (Mar mir)) → result | mar) in
  let new_mem =
    ((Wr mir) → store rep (mem, address rep mar, mbr)
              | mem) in
  let new_clk = (F,F) in
  (new_reg, new_psw, new_pc, new_mem, ivec, new_ir, new_mar,
   new_mbr, new_mpc, alatch, blatch, ireq_ff, iack_ff, mir,
   urom, new_clk)


Figure 5.10: Phase four of the phase-level interpreter (continued).

The specification of the fourth phase is dependent on several auxiliary functions. 
For example, the result from the ALU is calculated by ALUJFUNC. 


    ⊢def ALU_FUNC rep s a_input blatch carry_in =
        ((s = (F,F,F,F)) → (add rep (a_input, blatch)) |
         (s = (F,F,F,T)) → (addc rep (a_input, blatch, carry_in)) |
         (s = (F,F,T,F)) → (inc rep a_input) |
         (s = (F,F,T,T)) → (sub rep (a_input, blatch)) |
         (s = (F,T,F,F)) → (subc rep (a_input, blatch, carry_in)) |
         (s = (F,T,F,T)) → (dec rep a_input) |
         (s = (F,T,T,F)) → (band rep (a_input, blatch)) |
         (s = (F,T,T,T)) → (bxor rep (a_input, blatch)) |
         (s = (T,F,F,F)) → (bor rep (a_input, blatch)) |
         (s = (T,F,F,T)) → (bnot rep a_input) |
         a_input)
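
To make the dispatch on the select code concrete, the following Python sketch mirrors the structure of the ALU function for one fixed word size. It is only an illustration under stated assumptions: the 32-bit width, the unsigned wrap-around arithmetic, the reading of "subtract with borrow" as a - b - carry, and the name alu_func are all choices made here, whereas the HOL definition is parameterized by rep and leaves the word operations abstract.

```python
# Illustrative model only; the HOL definition keeps word operations abstract.
WIDTH = 32                      # assumed word width
MASK = 2 ** WIDTH - 1

F, T = False, True


def alu_func(s, a_input, blatch, carry_in):
    """Select an ALU operation from the 4-bit select code s (tuple of bools)."""
    table = {
        (F, F, F, F): lambda: a_input + blatch,              # add
        (F, F, F, T): lambda: a_input + blatch + carry_in,   # addc
        (F, F, T, F): lambda: a_input + 1,                   # inc
        (F, F, T, T): lambda: a_input - blatch,              # sub
        (F, T, F, F): lambda: a_input - blatch - carry_in,   # subc (assumed)
        (F, T, F, T): lambda: a_input - 1,                   # dec
        (F, T, T, F): lambda: a_input & blatch,              # band
        (F, T, T, T): lambda: a_input ^ blatch,              # bxor
        (T, F, F, F): lambda: a_input | blatch,              # bor
        (T, F, F, T): lambda: ~a_input,                      # bnot
    }
    # Any other select code passes the A input through unchanged.
    op = table.get(tuple(s), lambda: a_input)
    return op() & MASK
```

As in the specification, a select code that matches none of the listed cases simply returns the A input.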


The auxiliary functions keep the specification of the fourth phase from being more 
unwieldy than it already is and significantly reduce the amount of time to verify 
the phase-level since the time to rewrite a term grows exponentially with its size. 

An interesting point in the specification of the fourth phase is that we calculate
the value of the microprogram counter using the same function, MPC_UNIT, that
we use in the electronic block model. The specification for MPC_UNIT represents
a functionality assumed at every level in the specification. Thus, we have not
proven very much about the microprogram unit, only that it is hooked up correctly.
As we mentioned earlier, we have not implemented and verified the blocks in the 
electronic block model in this proof. The goal of this work is a demonstration that 
the generic interpreter theory and hierarchical decomposition work. The proof of 
low-level objects is orthogonal to this goal. We believe, however, that the abstract 
specifications used at different levels are reasonable and that circuits meeting our 
specification could be built. 


Defining select. The abstract function select returns a key based on the value 
of the state and the environment. In the case of the phase-level, the key is simply 
the clock. 


    ⊢def GetPhaseClock rep
            (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
             alatch, blatch, ireq_ff, iack_ff, mir, urom, clk)
            (int_e) = clk
Defining key. Key transforms a key into a number. Our clock is represented by a
boolean 2-tuple, so the tuple function bt2_val serves as the representation for key. 


Defining substate. The state is identical at the phase-level and the electronic 
block model; therefore, the substate function is represented using the built-in 
identity function, I. 


Defining subenv. The environment is identical at the phase-level and the elec- 
tronic block model; therefore, the subenv function is represented using the built-in 
identity function, I. 


Defining the Phase-Level Interpreter. Unlike the electronic block model
specification, we do not combine the phase-level definitions together into a
specification for the phase-level. We will use a properly instantiated form of the
definition of the generic interpreter from Section 2.3 as our phase-level specification.

In Section 2.3, we defined a generic interpreter, INTERP. The first argument 
to INTERP is the representation. The representation tuple contains the concrete 
instantiations for the abstract objects from the abstract representation, in the order 
that they appear in the abstract representation. Table 5.11 shows the functions 
used to instantiate the abstract representation. The result is a specification of the 
phase-level interpreter: 


    ⊢def Phase_Int rep s e =
        INTERP
          ([(F,F),phase_one rep;
            (F,T),phase_two rep;
            (T,F),phase_three rep;
            (T,T),phase_four rep],
           bt2_val,
           GetPhaseClock rep,
           I,
           I,
           EBM rep,
           GetEBMClock rep,
           EBM_Start) s e


Note that the first argument to the phase-level description, Phase_Int, is rep.
This is a different abstract representation than the one used to describe the generic
interpreter theory. As we mentioned earlier, the definition of the microprocessor
is given in terms of an abstract representation for n-bit words, :*wordn. The
variable rep in the above definition is a representation variable for the abstract
word data type.
Table 5.11: The functions used to instantiate the abstract representation
            of the generic interpreter theory for the phase-level.

    Operation    Instantiation
    inst_list    list of phase instructions
    key          bt2_val
    select       GetPhaseClock
    substate     The identity function, I
    subenv       The identity function, I
    Impl         EBM
    count        GetEBMClock
    begin        EBM_Begin


The definition of our microprocessor has two layers of abstraction. We instantiate
the generic interpreter theory to get an abstract representation of the
microprocessor, which is then instantiated with a word package (for example, word32,
for a 32-bit microprocessor) to yield a completely specified microprocessor. Thus, rep
in the above definitions, and all of the definitions and theorems in this section,
denotes the abstract representation for the microprocessor's basic data type.

The definition of Phase_Int is not very satisfying since it does not look like the
predicate that we expect to see in a microprocessor specification. We can
instantiate the definition using a function from the abstract theory package as follows:


    let Phase_Int = save_thm
        (`Phase_Int`,
         BETA_RULE (
           EXPAND_LET_RULE
             (instantiate_abstract_definition
                `gen_I`
                `INTERP`
                Phase_Int_def)))


The string gen_I in the above expression is the name of the generic interpreter theory 
and INTERP gives the name of the definition to instantiate. We expand the let terms 
in the result to create the more familiar top-level predicate specification: 
    ⊢ Phase_Int rep s e =
        (∀t.
          s(t + 1) =
            SND
              (EL
                (bt2_val(GetPhaseClock rep (s t)(e t)))
                [(F,F),phase_one rep;
                 (F,T),phase_two rep;
                 (T,F),phase_three rep;
                 (T,T),phase_four rep]) (s t) (e t))


This theorem defines the phase-level interpreter by relating the state at time t + 1 to
the state and environment at time t. The relationship is based on the nth member
of the instruction list, where n is calculated from the phase-level clock.
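
The shape of this theorem can be imitated outside HOL. The Python sketch below is a minimal model, not the HOL definition: interp_step applies the transition selected by the key, and the toy phases merely count steps and advance a two-bit clock. The names interp_step, make_phase, and get_clock, and the toy state layout, are assumptions introduced for illustration; only the last phase's reset of the clock to (F,F) is taken from the phase-four specification.

```python
F, T = False, True


def bt2_val(bits):
    """Convert a boolean pair (most significant bit first) to a number."""
    hi, lo = bits
    return 2 * int(hi) + int(lo)


def interp_step(inst_list, key, select, state, env):
    """One step: s(t+1) = SND (EL (key (select s e)) inst_list) s e."""
    n = key(select(state, env))
    _key, transition = inst_list[n]
    return transition(state, env)


def make_phase(next_clk):
    # Toy transition: count the step and move to the given clock value.
    return lambda state, env: (state[0] + 1, next_clk)


# Instruction list of (key, transition) pairs, ordered by clock value.
phases = [((F, F), make_phase((F, T))),
          ((F, T), make_phase((T, F))),
          ((T, F), make_phase((T, T))),
          ((T, T), make_phase((F, F)))]


def get_clock(state, env):
    # The key is simply the clock component of the state.
    return state[1]


state = (0, (F, F))
for _ in range(4):
    state = interp_step(phases, bt2_val, get_clock, state, None)
```

After four steps the toy clock has cycled through all four phases and is back at (F,F), which is the behavior the instruction list is meant to encode.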


5.2.4 Defining the Microcode. 

The phase-level interpreter definition is independent of the contents of the micro- 
rom; the microrom appears as a variable. Thus, the microcode is not a level in the 
abstraction, but rather the data that the phase-level interpreter will act upon to 
implement the micro-level interpreter. 

Recall from the discussion of the microinstruction register in Section 5.1.2 that a 
microinstruction consists of 40 bits in 24 fields which can be broken into 4 groups: 
those affecting the operation of the microprocessor, those affecting the program 
status word, those dealing with external signals, and those that are used for mi- 
croinstruction sequencing. Table 5.12 briefly reviews the meaning of these fields. 
Refer to Section 5.1.2 for a more detailed description. 
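
The bit and field counts can be checked directly against Table 5.12. The short Python sketch below transcribes the field widths from that table and totals them; the dictionary layout and the group names are only a convenience chosen here.

```python
# Field widths transcribed from Table 5.12, grouped as in the text.
FIELDS = {
    "operation": {"AMUX": 1, "SHFT": 2, "ALU": 4, "MAR": 1, "MBR": 1,
                  "PMUX": 1, "SRCA": 3, "SRCB": 2, "TRGT": 3},
    "psw": {"S_SM": 1, "C_SM": 1, "S_IE": 1, "C_IE": 1, "LD_C": 1,
            "LD_V": 1, "LD_N": 1, "LD_Z": 1, "CSRC": 1},
    "external": {"IACK": 1, "FTCH": 1, "RD": 1, "WR": 1},
    "mpc": {"COND": 3, "ADDR": 6},
}

# Bits per group, then totals over the whole microinstruction.
group_bits = {group: sum(widths.values()) for group, widths in FIELDS.items()}
total_bits = sum(group_bits.values())
total_fields = sum(len(widths) for widths in FIELDS.values())
```

The groups contribute 18, 9, 4, and 9 bits respectively, giving the 40 bits and 24 fields quoted above.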


5.2.4.1 The Microcode Assembler. 

We use ML to assemble the microcode into the bit-strings that will be used by 
the phase-level interpreter to implement the micro-level interpreter. The goal in 
writing this assembler was not to produce a production quality assembler, but 
rather to allow mnemonic names to be used to define the microcode so that errors 
can be reduced. This section will describe how the microassembler is used. For 
implementation details, see [Win90b] . 

The microcode assembler is implemented using four functions, one for each of the 
four groups of fields in the microinstruction. 


The Operations Group. The operations group is specified using a function Oper 
which takes the following 6 arguments: 
Table 5.12: The microinstruction format for AVM-1.

    Operation Group

    Bits  Mnemonic  Description
    1     AMUX      Toggle MUX on A-bus
    2     SHFT      Shifter function
    4     ALU       ALU function
    1     MAR       Load MAR from P-Mux
    1     MBR       Load MBR from C-bus
    1     PMUX      Toggle MUX loading MAR
    3     SRCA      A-bus source
    2     SRCB      B-bus source
    3     TRGT      C-bus target

    Program Status Word Group

    Bits  Mnemonic  Description
    1     S_SM      Set supervisory mode bit in PSW
    1     C_SM      Clear supervisory mode bit in PSW
    1     S_IE      Set interrupt enable bit in PSW
    1     C_IE      Clear interrupt enable bit in PSW
    1     LD_C      Load carry bit in PSW
    1     LD_V      Load overflow bit in PSW
    1     LD_N      Load negative bit in PSW
    1     LD_Z      Load zero bit in PSW
    1     CSRC      Source of carry (shifter or alu)

    External Signals Group

    Bits  Mnemonic  Description
    1     IACK      Interrupt acknowledge signal
    1     FTCH      Fetch signal
    1     RD        Read signal
    1     WR        Write signal

    Microprogram Counter Group

    Bits  Mnemonic  Description
    3     COND      Microcode jump condition
    6     ADDR      Next address







Table 5.13: Register mnemonics for the microassembler.

    Mnemonic     Meaning
    reg_file     Register file
    ssp          Supervisor stack pointer
    ir           Instruction register
    psw          Program status word
    pc           Program counter (unconditional)
    pcj          Program counter (using jump conditions)
    mar          Memory address register
    mbr          Memory buffer register
    noreg        No register
    mar_gets_pc  Load MAR from PC
    reg_dest     Register file (using dest field from IR)
    C255         Constant value (255)
    ivec         Interrupt vector


Table 5.14: Shifter mnemonics for the microassembler.

    Mnemonic  Meaning
    shl       Shift left
    shr       Shift right
    asr       Arithmetic shift right
    nsh       No shift


1. Specifies the target register for the operation using the mnemonic values shown 
in Table 5.13. 

2. Specifies the shifter operation using the mnemonic values shown in Table 5.14. 

3. Specifies the A source register using the mnemonic values shown in Table 5.13. 

4. Specifies the ALU operation using the mnemonic values shown in Table 5.15. 

5. Specifies the B source register using the mnemonic values shown in Table 5.13. 

6. Specifies special operations related to the memory address register and the 
memory buffer register using the mnemonic values shown in Table 5.13. 

Note that not all of the mnemonic values for the registers are allowable in every
position. For example, mbr can appear in the target field or the A source field, but
not the B source field. This is not checked by the assembler, so improper use can
give unexpected results.

Here are a few examples of the use of Oper: 
Table 5.15: ALU mnemonics for the microassembler.

    Mnemonic  Meaning
    add       Add without carry
    addc      Add with carry
    inc       Increment
    sub       Subtract without borrow
    subc      Subtract with borrow
    dec       Decrement
    band      Bit-wise conjunction
    bxor      Bit-wise exclusive disjunction
    bor       Bit-wise disjunction
    bnot      Bit-wise negation
    nop       No ALU operation


Table 5.16: Program status word mnemonics for the microassembler.

    Mnemonic         Meaning
    set_sm           Set the supervisory mode bit
    clr_sm           Clear the supervisory mode bit
    set_ie           Set the interrupt enable bit
    clr_ie           Clear the interrupt enable bit
    pass             Take no action
    ld_from_alu      Load carry from the ALU
    ld_from_shifter  Load carry from the Shifter
    ld_vf            Load the overflow bit
    ld_nf            Load the negative bit
    ld_zf            Load the zero bit


    Oper(reg_file,nsh,reg_file,add,pc,noreg);;
    Oper(reg_file,shl,mbr,band,reg_file,mar);;


The first example adds the contents of the register selected by the A source field in
the instruction register to the program counter and stores the result in the register
selected by the destination field of the instruction register. The second example
takes the bit-wise conjunction of the MBR with the register selected by the B source
field in the instruction register, shifts the result left, and stores it in the register
selected by the destination field of the instruction register; the MAR is loaded with
the result as well.
Table 5.17: External signal mnemonics for the microassembler.

    Mnemonic   Meaning
    rd         A read is in progress
    wr         A write is in progress
    no_mem_op  No memory operation is in progress
    i_ack      Set the interrupt acknowledge flag
    off        Turn the signal off
    in_fetch   CPU is in a fetch cycle


The PSW Group. Table 5.16 gives the names and meanings for the mnemonics 
affecting the program status word (PSW). The value of the PSW group of bits is 
declared using function Set_PSW which has 6 arguments. The meaning of the 6 
arguments is given below: 

1. Set, clear, or pass (leave unchanged) the supervisory mode bit. 

2. Set, clear, or pass the interrupt enable bit. 

3. Load the carry bit from either the ALU or the Shifter or take no action. 

4. Load the overflow bit or take no action.

5. Load the negative bit or take no action.

6. Load the zero bit or take no action.

The following examples show how the Set_PSW function is used: 


    Set_PSW (set_sm, clr_ie, pass, pass, pass, pass);;
    Set_PSW (pass, pass, ld_from_alu, ld_vf, ld_nf, ld_zf);;
    Set_PSW (pass, pass, ld_from_shifter, ld_vf, ld_nf, ld_zf);;


The first example sets the supervisory mode bit, clears the interrupt enable bit,
and leaves the others unchanged. The second example leaves the supervisory mode
bit and the interrupt enable bit unchanged, loads the carry bit from the ALU,
and sets the other status bits from the last ALU operation. The third
example differs from the second only in that the carry bit is loaded from the Shifter
instead of the ALU. Like the Oper function, the Set_PSW function does not check
for most errors.
Table 5.18: Microprogram counter mnemonics for the microassembler.

    Mnemonic  Meaning
    step      Increment the program counter and go there
    jmp       Jump unconditionally
    jop       Jump relative to mpc based on current opcode
    jint      Jump on interrupt
    jsm       Jump when in supervisory mode


The External Signals Group. Table 5.17 gives the names and meanings for the
mnemonics affecting the external signals. The value of the group of bits for external
signals is declared using the function ExtSig, which has 3 arguments. The meaning of
the 3 arguments is given below:

1. Specifies whether or not an interrupt is being acknowledged.

2. Specifies whether or not the CPU is in fetch mode. 

3. Specifies the current memory operation. 

The following examples show how the ExtSig function is used: 


    ExtSig(off,off,rd);;
    ExtSig(i_ack,in_fetch,no_mem_op);;


In the first example, the microcode turns off interrupt acknowledge, is not in fetch 
mode, and is performing a read. The second example is acknowledging an interrupt, 
is in the fetch portion of its cycle, and has no memory operation occurring. 


The MPC Group. Table 5.18 gives the names and meanings for the mnemonics
affecting the microprogram counter. This group of bits is declared using the
function Mpc, which has 2 arguments:

1. The jump condition.

2. The address of the next microinstruction for all jump conditions except the
sequencing operator step.

When a conditional jump fails, the microprogram counter is incremented. The 
following examples show how the Mpc function is used: 
    let TEST_ADDR = "(F,F,F,F,F,F)";;
    Mpc (step, TEST_ADDR);;
    Mpc (jint, TEST_ADDR);;


The first example goes to the next instruction in the microrom regardless of the
address given. The second example jumps to location TEST_ADDR when the interrupt
flipflop is set.
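
The sequencing rule described for this group can be summarized in a few lines of Python. This is a simplified sketch rather than the MPC_UNIT specification: the function name next_mpc, the use of plain integers for the 6-bit counter, the wrap-around at 64, and the two boolean inputs are assumptions made here, and the opcode-driven jop condition is deliberately left out.

```python
def next_mpc(cond, mpc, addr, interrupt=False, supervisor=False):
    """Return the next microprogram counter for a jump condition mnemonic."""
    if cond == "step":
        taken = False          # never jumps; addr is ignored
    elif cond == "jmp":
        taken = True           # unconditional jump
    elif cond == "jint":
        taken = interrupt      # jump when the interrupt flipflop is set
    elif cond == "jsm":
        taken = supervisor     # jump when in supervisory mode
    else:
        raise ValueError("unsupported jump condition: " + cond)
    # A failed conditional jump falls through to the next microinstruction.
    return addr if taken else (mpc + 1) % 64
```

For instance, next_mpc("step", 5, 0) yields 6 whatever address is supplied, matching the first example above.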


Assembling Microcode. Each of the four functions returns an HOL term con-
sisting of the appropriately sized n-tuple of boolean values. The four functions can
be used to specify a microinstruction using HOL's antiquotation operator:


    "(^(Oper(noreg,nsh,noreg,nop,noreg,mar_gets_pc)),
      ^(Set_PSW (pass, pass, pass, pass, pass, pass)),
      ^(ExtSig(off,off,rd)),
      ^(Mpc(jint,EINT_ul_ADDR)))"


The antiquotation operator (caret) allows an expression that results in a term to be
incorporated into an explicit term declaration. This example returns a bit-string
broken into four groups, one for each of the four operations just described.
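
As a rough illustration of what two of the group functions produce, the Python sketch below maps mnemonics to tuples of booleans laid out in the field order of Table 5.12. It is not the ML assembler from [Win90b]: the function names ext_sig and set_psw are hypothetical, the polarity chosen for the CSRC bit is an assumption, and the encodings of the Oper and Mpc groups are not attempted because the text does not give them.

```python
def ext_sig(iack, fetch, mem_op):
    """External signals group as the 4 bits (IACK, FTCH, RD, WR)."""
    return (iack == "i_ack", fetch == "in_fetch",
            mem_op == "rd", mem_op == "wr")


def set_psw(sm, ie, carry, overflow, negative, zero):
    """PSW group as 9 bits in the field order of Table 5.12."""
    return (sm == "set_sm", sm == "clr_sm",
            ie == "set_ie", ie == "clr_ie",
            carry in ("ld_from_alu", "ld_from_shifter"),   # LD_C
            overflow == "ld_vf",                           # LD_V
            negative == "ld_nf",                           # LD_N
            zero == "ld_zf",                               # LD_Z
            carry == "ld_from_shifter")                    # CSRC (assumed polarity)
```

Under these assumptions ext_sig("off", "off", "rd") gives (F,F,T,F) and an all-pass set_psw gives nine false bits; like the real assembler, the sketch performs no error checking on its arguments.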


5.2.4.2 The Microinstructions.

Using the microassembler, we can define the microprogram. The microprogram
is 53 microwords long and begins at location 0 of the microrom. Most of the
macroinstructions are implemented in 4 microinstructions. This section will briefly
describe the important features of the microprogram and give several examples
of microinstructions that implement macroinstructions. The complete program is
contained in [Win90b].


The FETCH Instruction. Every macroinstruction begins with the same
three-microinstruction sequence. The only exception is when an external interrupt is
being processed. The first microinstruction fetches from memory the instruction to
be executed next (the one pointed to by the program counter).


    ⊢def FETCH_mc =
        (^(Oper(noreg,nsh,noreg,nop,noreg,mar_gets_pc)),
         ^(Set_PSW (pass, pass, pass, pass, pass, pass)),
         ^(ExtSig(off,off,rd)),
         ^(Mpc(jint,EINT_ul_ADDR)))
If the interrupt flipflop is set, the program branches to the routine that handles
external interrupts (located at EINT_ul_ADDR). 

The definition of FETCH given above actually gets assembled before it is stored in 
the theory. Here is what the assembled version looks like. 


    ⊢def FETCH_mc =
        (F, (T,T), (T,F,F,T), F,T,T, (T,T,F), (T,F,T), T,F),
        (F,F,F,F,F,F,F,F,F),
        (F,F,T,F),
        (F,T,T),T,T,F,F,F,T


Throughout this section, we will show the unassembled versions, but they are ac- 
tually stored as bit-strings. 


The ISSUE Instruction. If the interrupt flipflop is not set, the FETCH instruction 
is followed by the ISSUE instruction. The ISSUE instruction moves the word that 
was just fetched from memory to the instruction register. 


    ⊢def ISSUE_mc =
        (^(Oper(ir,nsh,mbr,nop,noreg,noreg)),
         ^(Set_PSW (pass, pass, pass, pass, pass, pass)),
         ^(ExtSig(off,off,no_mem_op)),
         ^(Mpc(step,DUMMY)))


The DECODE Instruction. The next instruction is the DECODE instruction which 
increments the program counter (in preparation for the next cycle) and branches to 
the locations in the microcode given by the opcode of the word in the instruction 
register plus an offset of 4. Thus, locations 4 through 35 of the microrom are a 
look-up table of microinstructions. The correct microinstruction is selected by the 
opcode of the current macroinstruction. 


    ⊢def DECODE_mc =
        (^(Oper(pc,nsh,pc,inc,noreg,noreg)),
         ^(Set_PSW (pass, pass, pass, pass, pass, pass)),
         ^(ExtSig(off,off,no_mem_op)),
         ^(Mpc(jop,DUMMY)))


The jop jump condition does not use the address portion of the microinstruction, 
so a dummy address is used as the addr field. 
The JMP_ul Instruction. After the instruction has been decoded, the work
specific to the macroinstruction being implemented is performed. For example, if the
opcode of the current instruction has a value of 0 (the JMP instruction), then DECODE
would jump to location 4 and execute the following microinstruction:


    ⊢def JMP_ul_mc =
        (^(Oper(pcj,nsh,reg_file,add,ir,noreg)),
         ^(Set_PSW (pass, pass, pass, pass, pass, pass)),
         ^(ExtSig(off,off,no_mem_op)),
         ^(Mpc(jmp,FETCH_ADDR)))


This microinstruction conditionally loads the program counter with the value of the
immediate portion of the instruction register plus the contents of the register in the
register file selected by the current instruction. The conditional load is based on
the value of the destination field of the current instruction and the values of the
status bits in the program status word. After loading the program counter, the
microinstruction returns to the beginning of the microprogram.


The ADD_ul Instruction. The ADD macroinstruction is implemented by the
following microinstruction.


    ⊢def ADD_ul_mc =
        (^(Oper(reg_file,nsh,reg_file,add,reg_file,noreg)),
         ^(Set_PSW (pass, pass, ld_vf, ld_nf, ld_from_alu, ld_zf)),
         ^(ExtSig(off,off,no_mem_op)),
         ^(Mpc(jmp,FETCH_ADDR)))


This instruction takes both operands from the register file and stores the result of 
adding them to the register file. The A source, B source, and destination registers 
in the register file are all selected by the respective fields in the instruction register. 
The ADD instruction sets the appropriate bits in the program status word based on 
the results of the addition. 


The MICROROM Definition. The microrom is a function with domain :num
and range :ucode. We define it by using EL to select the nth element from a list of
the microinstructions.
    ⊢def micro_rom n =
        EL n
          [FETCH_mc;
           ISSUE_mc;
           DECODE_mc;
           NOOP_ul_mc;
           JMP_ul_mc;
           ...
           ADD_ul_mc;
           ...
           NOOP_ul_mc]


5.2.5 Defining the Micro-Level. 

The micro-level interpreter is an abstraction of the behavior described by the phase- 
level interpreter. At the micro-level, we are concerned with the behavior of the 
microinstructions, not their implementation. 


Defining the Micro-Level State. The state-tuple that describes the micro-level
interpreter state is an abstraction of the state-tuple describing the state of the
phase-level.

    (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)

These variables have the same meaning as they did at the electronic block model 
level. Note that state-variables such as the latches feeding the ALU are no longer 
available. The only state visible at the micro-level is that seen by someone writing 
microcode. 


Defining the Instruction List. The instruction list at the micro-level is the 
same length as the microrom and the keys associated with each instruction are 
identical to the instruction’s location in the microrom (rather than being an opcode). 
We will give abstract behavioral descriptions of each of the microinstructions that 
were described in section 5.2.4. 


The FETCH Instruction. The memory buffer register is loaded with the
instruction currently pointed to by the program counter. If the interrupt flag is high
and interrupts are enabled, the CPU enters the interrupt routine in the microcode;
otherwise the next instruction in the microrom is executed.


    ⊢def FETCH rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)
                   (int_e, reset_e) =
        let new_mar = pc and
            new_mbr = fetch rep (mem, address rep pc) and
            new_mpc = (int_e ∧ (get_ie rep psw)) → ^EINT_ul_ADDR
                                                  | add_bt6 mpc 1 in
        (reg, psw, pc, mem, ivec, ir, new_mar, new_mbr, new_mpc)


The ISSUE Instruction. The contents of the memory buffer register are moved 
into the instruction register. The program continues with the next instruction in 
the microrom. 


    ⊢def ISSUE rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)
                   (int_e, reset_e) =
        let new_ir = mbr and
            new_mpc = add_bt6 mpc 1 in
        (reg, psw, pc, mem, ivec, new_ir, mar, mbr, new_mpc)


The DECODE Instruction. During this instruction, the program counter is in- 
cremented. The most important action, however, is the jump at the end of the 
instruction to a location based on the current opcode portion of the instruction 
register. 


    ⊢def DECODE rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)
                    (int_e, reset_e) =
        let new_pc = inc rep pc and
            new_mpc = (add_bt6 (F, (SND(opcode rep ir))) 4) in
        (reg, psw, new_pc, mem, ivec, ir, mar, mbr, new_mpc)


Note that the value used for the look-up is not the entire 6-bit opcode, but only the
5 least significant bits, padded with a false value in the most significant bit. The
effect of this is to decrease the opcode space to 32 instructions. This was adequate
for AVM-1; the top 32 instructions are reserved for a future co-processor.
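
The address arithmetic performed by DECODE is small enough to work through by hand. The Python sketch below treats the opcode as an integer, which is an assumption made for illustration (the HOL definition works on boolean tuples), and decode_target is a hypothetical name.

```python
def decode_target(opcode):
    """Microrom address selected by DECODE for a 6-bit opcode."""
    low5 = opcode & 0b011111   # keep the 5 least significant opcode bits
    return low5 + 4            # offset past FETCH, ISSUE, DECODE and one no-op
```

Opcode 0 (JMP) maps to location 4, and the largest 5-bit value, 31, maps to location 35, which agrees with the look-up table occupying locations 4 through 35. Under the same formula, location 20, where micro_inst_list places ADD_ul, corresponds to opcode 16.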


The JMP_ul Instruction. This microinstruction changes the program counter to
the new value (computed by adding R[a] to imm) if the jump conditions are met.
The microprogram counter is set so that control returns to the beginning of the
microprogram (FETCH_ADDR is the address of the FETCH instruction).
    ⊢def JMP_ul rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)
                    (int_e, reset_e) =
        let a = EL (reg_len rep (srca rep ir)) reg and
            i = imm rep ir and
            d = reg_len rep (dest rep ir) in
        let result = add rep (a, i) in
        let jump_cond = JUMP_COND rep d psw in
        let new_pc = (jump_cond → result | pc) and
            new_mpc = ^FETCH_ADDR in
        (reg, psw, new_pc, mem, ivec, ir, mar, mbr, new_mpc)


The boolean-valued jump_cond is calculated using the function JUMP_COND defined
in Section 5.2.2.


The ADD_ul Instruction. This microinstruction adds the values in R[a] and R[b]
and stores the result back into R[d]. In addition, the program status word is
updated to reflect the status of the ALU after the operation, and the microprogram
counter is loaded with the address of the FETCH microinstruction.


    ⊢def ADD_ul rep (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)
                    (int_e, reset_e) =
        let a = EL (reg_len rep (srca rep ir)) reg and
            b = EL (reg_len rep (srcb rep ir)) reg and
            d = reg_len rep (dest rep ir) in
        let result = (add rep (a, b)) in
        let cflag = addp rep (a, b, result) and
            vflag = aovfl rep (a, b, result) and
            nflag = negp rep result and
            zflag = zerop rep result and
            sm = get_sm rep psw and
            ie = get_ie rep psw in
        let new_reg = UPDATE_REG rep psw d reg result and
            new_psw =
                mk_psw rep (sm, ie, vflag, nflag, cflag, zflag) and
            new_mpc = ^FETCH_ADDR in
        (new_reg, new_psw, pc, mem, ivec, ir, mar, mbr, new_mpc)


The Instruction List. Once we have defined all of the state transition func- 
tions denoting the microinstructions, we can put them together in a list suitable 
for use with the generic interpreter theory. The micro-level instruction set is repre- 
sented by a list of pairs, where the first member of the pair is the key for selecting 
it (the location of the microinstruction in the microrom in this case) and the second 
member of the pair is the state transition function. 
    ⊢def micro_inst_list rep =
        [((F,F,F,F,F,F), (FETCH rep));
         ((F,F,F,F,F,T), (ISSUE rep));
         ((F,F,F,F,T,F), (DECODE rep));
         ((F,F,F,F,T,T), (NOOP_ul rep));
         ((F,F,F,T,F,F), (JMP_ul rep));
         ...
         ((F,T,F,T,F,F), (ADD_ul rep));
         ...
         ((T,T,T,T,T,T), (NOOP_ul rep))]


Defining select. At the micro-level, we will view each location in the microrom
as constituting a new instruction. Strictly speaking this is not true, since there are
several instructions in the microrom that appear more than once. In fact, the
no-operation instruction appears 13 times. Due to the largely horizontal nature of the
microcode, however, most of the instructions are unique. Because we are treating each
location in the microrom as a separate instruction, we select the next instruction to
execute using the value of the microprogram counter.


    ⊢def GetMPC (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)
                (int_e, reset_e) = mpc


Defining key. Key transforms a key into a number. The microprogram counter 
is represented by a boolean 6-tuple, so the tuple function bt6_val serves as the 
representation for key. 


Defining substate. At the micro-level, substate is a function for performing the
data abstraction on the phase-level state to produce the micro-state tuple shown
earlier:


    ⊢def Phase_Substate rep
            (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc,
             alatch, blatch, ireq_ff, iack_ff, mir, urom, clk) =
        (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc)
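
The data abstraction performed by this function is a plain projection, which the following Python sketch illustrates. The named-tuple types PhaseState and MicroState and the function name phase_substate are conveniences assumed here; the component names and their order follow the two state tuples given in the text.

```python
from collections import namedtuple

# Phase-level state: sixteen components, in the order used by the text.
PhaseState = namedtuple(
    "PhaseState",
    "reg psw pc mem ivec ir mar mbr mpc "
    "alatch blatch ireq_ff iack_ff mir urom clk")

# Micro-level state: the first nine of those components.
MicroState = namedtuple(
    "MicroState", "reg psw pc mem ivec ir mar mbr mpc")


def phase_substate(s):
    """Drop the latches, flipflops, mir, urom, and clock from a phase state."""
    return MicroState(*s[:9])
```

Everything a microcode author can observe survives the projection, while the internal latches, flipflops, microinstruction register, microrom, and clock are hidden.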


Defining subenv. The environment is identical at the micro-level and the phase- 
level; therefore, the subenv function is represented using the built-in identity func- 
tion, I. 
Table 5.19: The functions used to instantiate the abstract representation
            of the generic interpreter theory for the micro-level.

    Operation    Instantiation
    inst_list    micro_inst_list
    key          bt6_val
    select       GetMPC
    substate     Phase_Substate
    subenv       I
    Impl         Phase_Int
    clock        GetPhaseClock
    begin        PhaseClockBegin


Defining the Micro-Level Interpreter. The definitions given in this section
(along with selected definitions from the previous section) are sufficient to
instantiate the interpreter definition in the generic interpreter theory. Table 5.19
shows the functions used to instantiate the abstract representation for the micro-level.
Just as we did at the phase-level, we can use these definitions to produce a top-level
specification of the interpreter at the micro-level:


    Micro_Int =
    ⊢ Micro_Int rep s e =
        (∀t.
          s(t + 1) =
            SND (EL (bt6_val(GetMPC(s t)(e t))) (micro_inst_list rep))
                (s t)
                (e t))


5.2.6 Defining the Macro-Level. 

The macro-level is the topmost specification in our hierarchy — making it the most 
abstract. The macro-level specification is a formal specification of what one would 
generally find in a programmer’s manual for a microprocessor (see Section 5.1.1). 
The specification describes the effect of each of the macro-level instructions on the 
processor’s state and defines how the instructions are selected. The major differ- 
ence between the formal specification of the microprocessor and the programmer’s 
manual is that the formal specification is less ambiguous and usually more concise. 
Defining the Macro-Level State. The state-tuple that describes the macro-level
interpreter state is an abstraction of the state-tuple describing the state of the
micro-level.

(reg, psw, pc, mem) 

These variables have the same meaning as they did at the micro-level. Note that 
registers such as the instruction register and the memory address register are no 
longer available. The only state visible at the macro-level is that seen by someone 
writing assembly code. 


Auxiliary Definitions. Before we specify the instructions, there are a few aux- 
iliary functions that are used to define the behavior of almost every instruction. 


    ⊢def GetSrcA rep pc mem =
        reg_len rep (srca rep (fetch rep (mem, address rep pc)))

    ⊢def GetSrcB rep pc mem =
        reg_len rep (srcb rep (fetch rep (mem, address rep pc)))

    ⊢def GetImm rep pc mem =
        (imm rep (fetch rep (mem, address rep pc)))

    ⊢def GetDest rep pc mem =
        reg_len rep (dest rep (fetch rep (mem, address rep pc)))


These functions return the values of the instruction fields from the word in memory 
pointed to by the program counter. Note that they reference memory and not 
the instruction register; at the macro-level, the instruction register is not visible. 
Also, not every instruction will use all of these auxiliary functions since the B and 
immediate fields overlap and some of the formats do not use all of the fields. 


Defining the Instruction List. We will not specify every instruction at the 
macro-level in this section, but will highlight a few example instructions. The 
complete specification for the macro-level is contained in [Win90b] . 


The JMP Instruction. The JMP instruction has a simple description. The value 
of the program status word and the contents of the destination field of the current 
instruction are used to determine if a jump should occur. If so, the program counter 
is loaded with the sum of the A register and the value of the immediate field from 
the current instruction. Otherwise, the program counter is incremented. 
    ⊢def JMP rep (reg, psw, pc, mem, ivec) =
        let a = EL (GetSrcA rep pc mem) reg and
            i = GetImm rep pc mem and
            d = GetDest rep pc mem in
        let jump_cond = JUMP_COND rep d psw in
        let new_pc = (jump_cond → (add rep (a, i)) | inc rep pc) in
        (reg, psw, new_pc, mem, ivec)
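
A small executable model may help in reading this definition. The Python sketch below computes only the new program counter; it assumes 32-bit wrap-around arithmetic, takes the already-decoded fields as arguments instead of fetching them from memory, and leaves the jump condition as a caller-supplied predicate because JUMP_COND is defined elsewhere. The name jmp_new_pc and its parameters are hypothetical.

```python
def jmp_new_pc(reg, psw, pc, src_a, imm, dest, jump_cond, width=32):
    """Return the new program counter for the macro-level JMP instruction."""
    mask = 2 ** width - 1
    if jump_cond(dest, psw):
        return (reg[src_a] + imm) & mask   # jump taken: R[a] + imm
    return (pc + 1) & mask                 # otherwise: increment the pc
```

Note the contrast with the micro-level JMP_ul, which leaves the program counter unchanged when the jump is not taken: there the increment has already been performed by DECODE, whereas at the macro-level the whole effect is expressed in a single transition.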


The ADD Instruction. The ADD instruction adds the contents of the registers 
selected by the A and B fields in the current instruction and stores the result in 
the register selected by the destination field of the current instruction. In addition, 
the program status word is updated to reflect the results of the calculation and the 
program counter is incremented. 


    ⊢def ADD rep (reg, psw, pc, mem, ivec) =
        let a = EL (GetSrcA rep pc mem) reg and
            b = EL (GetSrcB rep pc mem) reg and
            d = GetDest rep pc mem in
        let result = add rep (a, b) in
        let cflag = addp rep (a, b, result) and
            vflag = aovfl rep (a, b, result) and
            nflag = negp rep result and
            zflag = zerop rep result and
            sm = get_sm rep psw and
            ie = get_ie rep psw in
        let new_reg = UPDATE_REG rep psw d reg result and
            new_psw = mk_psw rep (sm, ie, vflag, nflag, cflag, zflag) and
            new_pc = inc rep pc in
        (new_reg, new_psw, new_pc, mem, ivec)


The EINT Instruction. The EINT instruction describes the behavior of the
microprocessor upon an external interrupt. The selection criterion for the external
interrupt instruction distinguishes it from the other instructions specified at this
level. Every other instruction is selected based on the value of the opcode portion
of the word in memory pointed to by the program counter; the EINT instruction
is selected whenever the external interrupt line in the environment is set. Because
its selection criterion differs substantially from that of the other instructions (and
because an assembly language programmer would not really think of it as an
instruction), we term EINT a “pseudoinstruction.” Even though we have not described
the implementation of this instruction in earlier sections, we include it here because
it is of interest both in its own right and in showing how pseudoinstructions can be
specified.

Every state variable in the macro-level state except the interrupt vector is changed
in the execution of the EINT instruction. The program status word is updated to
enter supervisory mode and disable further interrupts. The contents of the program
counter are pushed onto the supervisory stack, the supervisory stack pointer (SSP)
is incremented, and the program counter is loaded with the 8 least significant bits
of the interrupt vector.


⊢def EINT rep (reg, psw, pc, mem, ivec) =
    let cd = SSP_REG reg and
        d = ssp_reg in
    let cflag = get_cf rep psw and
        vflag = get_vf rep psw and
        nflag = get_nf rep psw and
        zflag = get_zf rep psw and
        sm = T and
        ie = F in
    let new_psw =
        mk_psw rep (sm, ie, vflag, nflag, cflag, zflag) in
    let new_reg = UPDATE_REG rep new_psw d reg (inc rep cd) and
        new_pc = band rep (wordn rep 255, int_fetch rep ivec) and
        new_mem = store rep (mem, address rep cd, pc) in
    (new_reg, new_psw, new_pc, new_mem, ivec)


Note that the value of the interrupt vector is retrieved using the int_fetch operation
from the abstract theory. This is required because the interrupt vector is shared
state.


The Instruction List. Before defining the instruction list and the selection 
function for the macro-level, we must decide upon a representation for the keys. 
The instruction’s opcode seems particularly well suited to be used as the key since 
it uniquely identifies the instruction and is a natural part of the description of an 
assembly language. However, there is one instruction, EINT, that has no opcode. 
We could assign an unused opcode to EINT, but this raises the issue of what to do
if that opcode appears in a program.

We chose to represent the keys at the macro-level using a coproduct of boolean
five-tuples (:bt5) and the type containing exactly one object (:one). Left injections
on the type represent real instructions and right injections represent
pseudoinstructions. We chose boolean five-tuples because there were approximately 32
instructions. There is only one pseudoinstruction, so :one, the type with only one
member, was the logical choice for its representation. There was nothing special
about associating :one with the pseudoinstructions; if there had been more than
one pseudoinstruction, another representation (such as boolean n-tuples) would
have worked.

Another small hurdle in defining the instruction list is that since none of the
instructions used the environment vector, the state transition functions defined above
take only one argument, the state. The second member of an instruction is defined
in the generic theory to take two arguments: the state and the environment. We
define ABS_ENV, which takes a function of type :(macro_state → macro_state)
and creates a function of type :(macro_state → macro_env → macro_state).


⊢def ABS_ENV f x y = f x


We can now define the macro-level instruction list. Every instruction uses the 
environment abstraction function to give it the proper type. The keys readily distin- 
guish between the real instructions and the pseudoinstructions — clearly specifying 
the opcodes associated with each real instruction. 


⊢def macro_inst_list rep =
    [(INL(F,F,F,F,F), ABS_ENV (JMP rep));
     ...
     (INL(T,F,F,F,F), ABS_ENV (ADD rep));
     ...
     (INR(one), ABS_ENV (EINT rep));
    ]


Defining select. The instruction selection function Opcode uses the environment 
and the state to determine which instruction to execute. 


⊢def Opcode rep (reg, psw, pc, mem, ivec) (int_e, reset_e) =
    (int_e ∧ (get_ie rep psw)) →
        INR(one) |
        INL(SND (opcode rep (fetch rep (mem, address rep pc))))


If the interrupt line in the environment is high and interrupts are enabled, then
the key associated with the external interrupt instruction, INR(one), is returned. 
Otherwise, a left injection of the 5 least significant bits of the opcode portion of the 
word in memory pointed to by the program counter is returned. 
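
The selection rule just described can be sketched in Python. The tagged pairs `("INL", bits)` and `("INR", None)` are an assumed encoding of the coproduct, and `get_ie` and `opcode_bits` are hypothetical callbacks standing in for get_ie and for the composite `SND (opcode rep (fetch rep (mem, address rep pc)))`.

```python
INR_ONE = ("INR", None)  # assumed encoding of the key INR(one)


def select_key(state, env, get_ie, opcode_bits):
    """Choose the instruction key from the current state and environment."""
    reg, psw, pc, mem, ivec = state
    int_e, reset_e = env
    # An external interrupt wins only when interrupts are enabled in the psw.
    if int_e and get_ie(psw):
        return INR_ONE
    # Otherwise the key is a left injection of the five opcode bits at pc.
    return ("INL", opcode_bits(mem, pc))
```

Note that `reset_e` is unpacked but unused, exactly as in the definition of Opcode above.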


Defining key. To instantiate the generic interpreter theory, we must be able to 
turn a key into a number that indexes the instruction associated with that key in 
the instruction list. The function Opc_Val performs that task:


⊢def Opc_Val (x:(bt5 + one)) =
    (ISL x) → (bt5_val (OUTL x))
            | 32


The function determines whether its argument is a left or right injection and then
uses the appropriate function to return the value. Because there is only one possible 
right injection, we can return 32 without any further work. 
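
A minimal Python sketch of the same idea follows. The tagged-pair encoding of the coproduct is an assumption, as is the most-significant-bit-first reading of the boolean tuple (it matches the ordering of the phase-level instruction list used later in this chapter, but the definition of bt5_val is not shown here).

```python
def bt_val(bits):
    """Numeric value of a boolean tuple, most significant bit first (assumed)."""
    value = 0
    for b in bits:
        value = 2 * value + (1 if b else 0)
    return value


def opc_val(key):
    """Map a key of the coproduct bt5 + one to an index into the instruction list."""
    tag, payload = key
    if tag == "INL":          # real instruction: index is the opcode value, 0..31
        return bt_val(payload)
    return 32                 # the single pseudoinstruction sits after the real ones
```

Because a five-bit opcode can never reach 32, the pseudoinstruction's index cannot collide with that of any real instruction.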


Defining substate. Micro_Substate is the function used to transform a micro-level
state-tuple into the macro-level state tuple shown above. Micro_Substate
is not as straightforward as the substating functions from the previous levels. In 
particular, the variables representing memory and the interrupt vector register both 
represent shared state. The interrupt vector register is shared with the interrupt 
controller and the memory is shared with a variety of devices. 


⊢def Micro_Substate rep
        (reg, psw, pc, mem, ivec, ir, mar, mbr, mpc) =
    (reg, psw, pc, trans rep mem, int_trans rep ivec)


In Section 3.4, we discussed the specification of shared state. The definition 
of Micro_Substate presents a concrete example of the theory in application. The
memory at the macro-level is a function of the memory at the micro-level. This 
function takes into account the changes that are occurring in memory due to the 
actions of other devices. In this way, the lower levels of the implementation can 
assume that they own memory without the top-level specification making the same 
assumption. The interrupt vector is handled similarly. As we will see in the verifi- 
cation of the macro-level, the use of the transformation functions on shared state 
leads to requirements in the proof that have very satisfying interpretations. 
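
As a rough illustration, the projection can be written in Python as below. The callbacks `trans` and `int_trans` are placeholders for the shared-state transformation functions of the abstract theory; their real behavior is left abstract in the specification, so any concrete function passed in is an assumption.

```python
def micro_substate(micro_state, trans, int_trans):
    """Project a micro-level state tuple onto the macro-level state tuple."""
    reg, psw, pc, mem, ivec, ir, mar, mbr, mpc = micro_state
    # ir, mar, mbr and mpc are invisible at the macro-level and are dropped.
    # Memory and the interrupt vector are shared state, so each passes through
    # its transformation function instead of being copied directly.
    return (reg, psw, pc, trans(mem), int_trans(ivec))
```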


Defining subenv. The environment is identical at the macro-level and the micro- 
level; therefore, the subenv function is represented using the built-in identity func- 
tion, I. 


Defining the Macro-Level Interpreter. The definitions given in this section 
(along with selected definitions from the previous section) are sufficient to instanti- 
ate the interpreter definition in the generic interpreter theory for the macro-level. 
Table 5.20 shows the functions used to instantiate the abstract representation. Just 
as we did at the micro-level, we can use these definitions to produce a top-level 
specification of the interpreter at the macro-level: 


Table 5.20: The functions used to instantiate the abstract representation
            of the generic interpreter theory for the macro-level.

    Operation    Instantiation
    ---------    ---------------
    inst_list    macro_inst_list
    key          Opc_Val
    select       Opcode
    substate     Micro_Substate
    subenv       I
    Impl         Micro_Int
    clock        GetMPC
    begin        FETCH_ADDR


Macro_Int =
⊢ Macro_Int rep s e =
   (∀ t.
     s(t + 1) =
       SND
        (EL (Opc_Val (Opcode rep (s t) (e t))) (macro_inst_list rep))
        (s t)
        (e t))
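
The shape of this definition is the same at every level: select a key, turn it into an index, and apply the indexed instruction to the current state and environment. A minimal Python sketch of that pattern is given below; the function names `abs_env`, `interp_step`, and `run` are illustrative and are not part of the HOL theory.

```python
def abs_env(f):
    """Lift a state-only transition so that it also accepts (and ignores) the environment."""
    return lambda state, env: f(state)


def interp_step(inst_list, key, select, state, env):
    """One step: s(t+1) = SND (EL (key (select s e)) inst_list) s e."""
    index = key(select(state, env))
    _, transition = inst_list[index]   # each entry is a (key, transition) pair
    return transition(state, env)


def run(inst_list, key, select, state, envs):
    """Apply interp_step once per environment sample and collect the state trace."""
    trace = [state]
    for env in envs:
        state = interp_step(inst_list, key, select, state, env)
        trace.append(state)
    return trace
```

For example, with a toy two-instruction list in which the environment sample itself is the key, `run([(0, abs_env(lambda s: s + 1)), (1, abs_env(lambda s: 0))], lambda k: k, lambda s, e: e, 5, [0, 0, 1, 0])` increments twice, resets, and increments again.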


5.2.7 Observations. 

Having completed the formal specification of AVM-1, we have several observations. 

This section has shown how a variety of architectural and organizational features 
can be modeled using the generic interpreter theory. One should not assume that 
we are claiming that every architectural feature will map onto the models presented 
in Chapter 4. Indeed, many may not. For example, we have not explored the use 
of our generic interpreter theory in pipelined architectures. 

What does this say then for the utility of generic theories? Certainly, many 
interesting features, such as interrupts, can be mapped onto the models given in 
this dissertation. Furthermore, formalizing new models is not a difficult process. 
We expect that our models will change and new models will be developed to suit 
new features. The major utility of generic theories, structuring the proof, is not 
diminished. 

Each of the interpreter levels uses a different concept of “key.” The phase-level, 
for example, uses the value of a polyphase clock as the instruction key. The micro- 
level, on the other hand, uses location in memory as the key to select an instruction. 
The macro-level uses an opcode as the key. Thus a program that is thousands of
instructions long at the micro-level implies that the micro-level instruction list
has thousands of entries. A program that is thousands of lines long at the
macro-level would still use only the 30 instructions given here.

Another interesting point concerning keys is their use at the macro-level to dis- 
tinguish between user instructions and pseudoinstructions. When specifying an 
interpreter, it is important to be flexible about the concept of an instruction. We 
would not have been able to model the external interrupts using the interpreter 
theory if we had not been willing to think of it as just another instruction that is 
selected using an environment signal instead of the program counter. 

The use of coproducts to specify the user instructions and pseudoinstruction 
keys also points out the utility of having a specification language that is powerful 
and expressive. Because HOL had coproducts, we were easily able to specify the 
distinction between these two types of instructions while continuing to use the 
opcode to select user instructions. 

The phase-level instructions perform the same action on every cycle. The only 
difference between one cycle and the next is the data in the microinstruction register. 
The phase-level could have been structured differently. We could have used the 
values in the microinstruction register to select among several instructions. For 
example, instead of selecting the second phase instruction when the clock was (F,T),
we could take action conditioned upon the contents of the microinstruction register. 
This would have made the specification of the phase-level much more complex and 
subsequently increased the amount of effort required to establish the electronic block 
model to phase-level correctness result given in the next section. 

The specifications of the electronic block model and phase-level provide an inter- 
esting point of discussion. The results from the ALU and shifter are calculated in 
the fourth phase in the phase-level interpreter even though in the electronic block 
model the results are calculated in the third phase. The difference is that the cal- 
culations in the phase-level interpreter happen instantaneously (from the state’s 
perspective) and are therefore calculated and stored in the last phase. The phase 
when the values are stored is what is important in verifying that the phase-level 
implies the electronic block model, not the phase when they are calculated. This 
is a good example of the kind of design mistake that our model will not catch. 
Because we are not concerned with gate delays, there is no way to model that the 
result from the ALU will not be available for some time after the latches are loaded. 
We probably could have left the third phase out of the design and still verified that 
the design was correct; obviously, the chip would not have worked even though the 
design was verified. 

In order to deal with timing issues, gate-delays would have to be built into the 
models. There is nothing to keep us from building specifications that model gate- 
delay; however, the models would be more complex and the verification that much 
more difficult. Given the current state of the art, it is probably better to leave timing
analyses to CAD systems for VLSI layout. In time, the timing analysis may also
be done in the formal system; but for now, it seems prudent to let the CAD tools
do what they do well and let formal verification do what it does well, namely verify
abstract functionality of structural descriptions.

One of the merits of an abstract specification can be clearly seen in the phase-level
specification. The interrupt request environment signal, ireq_e, is latched
into the interrupt flip-flop in the datapath during the first phase. The value of
the flip-flop is not used until the fourth phase when its contents are used by the
MPC_UNIT to calculate the new contents for the microprogram counter. One could
legitimately ask why the line is latched so early. The point of this discussion is not 
to debate that issue, but to point out that the phase-level specification is a useful 
tool for exploring these kinds of design issues. The circuit diagram and specification 
of the electronic block model contain this information, but it is more difficult to 
extract. 

Each level in the decomposition hierarchy corresponds to a real level in the micro- 
processor. We could introduce levels that do not correspond to these real levels. For 
example, we might add an additional level of abstraction between the micro-level 
and phase-level to reduce the size of the instruction set that we have to use at the 
micro-level. This is an area that needs further exploration. 

The specification of interrupts is incomplete until the specification presented in 
this chapter is composed with a priority interrupt controller that receives signals 
from devices and sets the interrupt vector accordingly. 

The specification treats ivec as a piece of state. Actually, the specification never 
changes the value of ivec and it seems that it could probably be treated as a 
member of the environment rather than the state. There is no set rule about what 
should be in the environment and what should be in the state, but in general, the 
environment is a good place to put signals that are read-only. A respecification 
of AVM-1 should place ivec in the environment instead of the state. This would 
simplify the specification since we would not have to treat ivec as shared state. 
More importantly, the composition of AVM-1 with a priority interrupt controller 
would be easier. 


5.3 AVM-1's Formal Verification


Microprocessor verification involves generating a correctness result of the form 

    ⊢ Structure ⇒ Behavior

from the microprocessor’s structural and behavioral specifications. The specifica- 
tions presented in the last section were written in the object language of HOL and 
can be manipulated in the HOL system. 

The opening part of this section describes how the generic interpreter theory can 
be used to prove the correctness of the macro-level with respect to the electronic 
block model using the hierarchical decomposition presented in the last section. 

Next, each level in the proof is examined, showing in detail how the proof of 
correctness for that level was obtained. The three levels are interesting in that 
different methods of proof were necessary in each. 

The last part of this section describes how the proofs of correctness for the three 
levels can be combined into an overall proof of correctness for AVM-1.


5.3.1 Instantiating the Generic Interpreter Theory. 

Before describing the actual instantiations of the generic interpreter theory, we 
discuss exactly what we hope to gain by this instantiation. Figure 5.11 shows how 
a combination of the generic interpreter and the definitions leads to specifications 
for the three interpreter levels. We want more than a description, however; we want
a verified correctness statement.

The diagonal lines from the interpreter specification at one level to the definitions
at the level above represent the proofs that must be done to satisfy the theory
obligations. Because of

1. the definitional relationship between the generic interpreter and the specifica- 
tion on one level and 

2. the theory obligations relating the implementation and the definitions between 
levels, 

we can conclude that the electronic block model implies the phase-level, the
phase-level implies the micro-level, and the micro-level implies the macro-level.
Using these theorems we can prove a result about the overall correctness of our
microprocessor.

Figure 5.11: The generic interpreter theory can be instantiated with definitions
             of the various levels from the hierarchical decomposition
             to yield a proof of the microprocessor.

In the sections that follow, we will be instantiating the generic interpreter proof 
to provide the desired correctness lemmas at each level. In each case, we will follow 
the following plan: 

1. Instantiate the generic interpreter definition, providing a specification of the 
interpreter at that level. 

2. Instantiate the generic correctness predicate so that it can be used in the 
proofs of the theory obligations. 

3. Prove the three theory obligations for the instantiation. This step constitutes 
the bulk of each section that follows. 

4. Using the proofs of the theory obligations, instantiate the correctness result 
from the generic theory. 

The sections that follow will be divided into subparts roughly corresponding to this 
plan. 

The instantiations, for the most part, are done by calling functions defined in the 
library package abstract which is discussed in Appendix A. We will describe the 
functions from that package as they are used. All of the instantiation functions are 
secure; that is, they do their work entirely through primitive inference in the object 
world of HOL. 


Table 5.21: The functions used to instantiate the abstract representation
            of the generic interpreter theory for the phase-level.

    Operation    Instantiation
    ---------    --------------------------
    inst_list    list of phase instructions
    key          bt2_val
    select       GetPhaseClock
    substate     The identity function, I
    subenv       The identity function, I
    Impl         EBM
    count        GetEBMClock
    begin        EBM_Begin


5.3.2 Verifying the Phase Level. 

We would like to show that the phase-level is implemented by the electronic block 
model. Logically, this amounts to showing that the electronic block model implies 
the phase-level by proof. 

Table 5.21 gives the concrete functions used to instantiate the generic interpreter 
theory at this level. These functions were all defined in Section 5.2 with the excep- 
tion of bt2_val which gives a numerical value to a boolean 2-tuple. 


The Definition. The definition of the phase-level specification was given in Sec- 
tion 5.2.3. Using the function for instantiating definitions from the abstract package 
and expanding the let terms in the result we get the following theorem: 


⊢ Phase_Int rep s e =
   (∀ t.
     s(t + 1) =
       SND
        (EL
          (bt2_val (GetPhaseClock (s t) (e t)))
          [(F,F),phase_one rep;
           (F,T),phase_two rep;
           (T,F),phase_three rep;
           (T,T),phase_four rep]) (s t) (e t))


This theorem defines the phase-level interpreter by relating the state at time t + 1 to
the state and environment at time t. The relationship is based on the nth member
of the instruction list, where n is calculated from the phase-level clock.


The Correctness Predicate. After instantiating the top-level specification for
the phase-level, we instantiate the instruction correctness predicate for the
phase-level. Each of the phase-level instructions must satisfy this predicate if we are
to meet the theory obligations. We first apply the generic instruction correctness
definition INSTRUCTION_CORRECT to the concrete representation given in Table 5.21.


⊢def Phase_Int_Inst_Correct rep s' e' =
    INST_CORRECT
     ([(F,F),phase_one rep;
       (F,T),phase_two rep;
       (T,F),phase_three rep;
       (T,T),phase_four rep],
      bt2_val,
      GetPhaseClock rep,
      I, I,
      EBM rep,
      GetEBMClock rep, EBM_Start) s' e'


After calling the function for instantiating definitions from the abstract package 
and some minor manipulation we get a predicate that can be used in subsequent 
proofs. 


Phase_Int_Inst_Correct =
⊢ Phase_Int_Inst_Correct rep s' e' p =
   EBM rep s' e' ⇒
   (∀ t.
     (GetPhaseClock rep (s' t) (e' t) = FST p) ∧
     (GetEBMClock rep (s' t) (e' t) = EBM_Start) ⇒
     (∃ c.
       Next (λ t'. GetEBMClock rep (s' t') (e' t') = EBM_Start) (t, t + c) ∧
       (SND p (s' t) (e' t) = s' (t + c))))


It is interesting to compare this version of the instruction correctness predicate with 
the generic one. The structure is the same, but the names have changed. 
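
To make the structure of the predicate concrete, the following Python sketch checks it over a finite prefix of an implementation trace. Several things here are assumptions: the reading of `Next f (t1, t2)` as "t2 is the first time after t1 at which f holds" (the definition lives in the generic theory and is not reproduced in this section), the use of Python lists for the state and environment streams, and the decision to skip any time whose witness would fall beyond the end of the finite trace. The antecedent (that the trace satisfies the implementation) is taken for granted rather than checked.

```python
def inst_correct(s, e, inst, select, substate, clock, begin):
    """Bounded check of the instruction correctness predicate for one instruction.

    s, e      finite state and environment traces of the implementation
    inst      (key, transition) pair from the instruction list
    select    selection function applied to the abstracted state
    substate  state abstraction function (the identity at the phase-level)
    clock     implementation clock; begin is its start-of-cycle value
    """
    key, transition = inst

    def at_begin(t):
        return clock(s[t], e[t]) == begin

    for t in range(len(s)):
        if not (select(substate(s[t]), e[t]) == key and at_begin(t)):
            continue
        later = [u for u in range(t + 1, len(s)) if at_begin(u)]
        if not later:
            continue              # the witness lies beyond the finite trace
        t_next = later[0]         # the only time satisfying Next(at_begin)(t, t_next)
        if transition(substate(s[t]), e[t]) != substate(s[t_next]):
            return False
    return True
```

At the phase-level the witness is always one step away, so `t_next` is simply `t + 1`; at higher levels several implementation steps elapse between consecutive start-of-cycle times.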


The Theory Obligations. There are three theory obligations that we are re- 
quired to meet before we can instantiate the generic theory. 

1. We must show that each instruction in the phase-level specification is correct 
with respect to the electronic block model. Specifically, we must prove that the 
instruction correctness predicate, Phase_Int_Inst_Correct is true for every 
instruction in the phase-level specification. 

2. We must show that every key selects an instruction. 

3. We must show that every key selects the right instruction. 


The Instruction Correctness Lemma. To establish the first theory obligation
for the generic interpreter theory, we will prove that the phase-level instruction
correctness predicate applies to each of the phases and then use these results to
establish that the predicate applies to every instruction.

In order to prove the correctness lemma, we will need a lemma about Next.
NEXT_LEMMA states the following:


NEXT_LEMMA =
⊢ ∀ t. t < (t + 1) ∧ (∀ t'. ¬(t < t' ∧ t' < (t + 1)))


This is a special form of the Next predicate when the existential variable is 1. It says
that t is less than t + 1 and that no natural number exists between t and t + 1.

The following theorem says that the instruction correctness predicate applied to 
the first instruction, phase_one, is a tautology.


PHASE_ONE_EBM_LEMMA =
⊢ Phase_Int_Inst_Correct rep
   (λ t.
     (reg t, psw t, pc t, mem t, ivec t, ir t,
      mar t, mbr t, mpc t, alatch t, blatch t,
      ireq_ff t, iack_ff t, mir t, urom, clk t))
   (λ t. (ireq_e t))
   ((F,F), phase_one rep)


We proved the instruction correctness lemma for the first phase using the following 
tactic: 


PURE_ONCE_REWRITE_TAC [Phase_Int_Inst_Correct]
THEN REPEAT GEN_TAC
THEN BETA_TAC
THEN REWRITE_TAC [GetPhaseClock; Next;
                  GetEBMClock; EBM_Start; phase_one_def;]
THEN SUBST_TAC [EBM_expanded]
THEN REPEAT STRIP_TAC
THEN POP_ASSUM_LIST
     (\asl. (MAP_EVERY (STRIP_ASSUME_TAC o SPEC_ALL) asl))
THEN EXISTS_TAC "1"
THEN ASM_REWRITE_TAC [PAIR_EQ; NEXT_LEMMA]


This tactic performs the following actions: 


1. Rewrite with the definition of the instruction correctness predicate. 

2. Strip the universal quantification using GEN_TAC.

3. Beta reduce the goal to remove the lambda expressions using BETA_TAC.

4. Rewrite with the definitions of functions from the instantiation and the defi- 
nition of the first phase. 

5. Substitute the expanded form of the electronic block model definition. The 
expanded form has all of the definitions completely expanded and is about 4 
pages long. Substitution does not perform unification the way that rewrit- 
ing does and is thus faster than rewriting. Substitution is sufficient for our 
purposes. 

6. Strip the antecedent of the implication (the expanded form of the electronic 
block model) and place it in the assumption list. 

7. Break the expanded form of the electronic block model into the definitions 
of the individual blocks using STRIP_ASSUME_TAC. This tactic picks arbitrary 
constants for the existential variables and then splits any conjunctions into 
two assumptions. 

8. Pick a witness for the existential variable in the instruction correctness pred- 
icate. For this level, finding an existential witness is easy; because there is no 
temporal abstraction taking place, the existential variable is always 1. 

9. Rewrite using the assumptions, NEXT_LEMMA, and a theorem about the equality 
of pairs. 

The above tactic only proves the first instruction correctness lemma. The tactics 
to prove the other instructions at the phase-level are more involved than this one. 
The tactic that proves the fourth phase is quite long. 

The instruction correctness lemma is difficult to prove at the phase-level since 
every instruction requires a different proof. As we will see, at the micro and macro- 
levels, one tactic suffices to prove every instruction correctness lemma. We will not 
show all of the proofs for the phase-level here, but they are contained in [Win90b] . 

After we have shown that the instruction correctness predicate is true for each of 
the instructions, we can show that it is true for every instruction. This satisfies the
first theory obligation. 


Phase_Int_Correct_LEMMA =
⊢ EVERY
   (Phase_Int_Inst_Correct rep
     (λ t.
       (reg t, psw t, pc t, mem t, ivec t, ir t, mar t, mbr t,
        mpc t, alatch t, blatch t, ireq_ff t, iack_ff t,
        mir t, urom, clk t))
     (λ t. (ireq_e t)))
   [(F,F),phase_one rep;
    (F,T),phase_two rep;
    (T,F),phase_three rep;
    (T,T),phase_four rep]


The Length Lemma. The second theory obligation is easy to show. The 
theorem says that the numeric value of a boolean 2-tuple is always less than the 
length of a four element list. 


Phase_Int_LENGTH_LEMMA =
⊢ bt2_val clk < (LENGTH
                  [(F,F),phase_one rep;
                   (F,T),phase_two rep;
                   (T,F),phase_three rep;
                   (T,T),phase_four rep])


The Order Lemma. The third theory obligation says that the numeric value 
of the first part of the pair denoting an instruction is the index of that instruction 
in the instruction list (i.e. that the list is correctly ordered). 


Phase_Int_ORDER_LEMMA =
⊢ clk = FST (EL (bt2_val clk)
              [(F,F),phase_one rep;
               (F,T),phase_two rep;
               (T,F),phase_three rep;
               (T,T),phase_four rep])


This lemma is also quite easy to show by case analysis. 
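
The case analysis behind the length and order lemmas is small enough to enumerate directly. The Python sketch below does so for the four phase keys; the strings stand in for the phase instructions, and the most-significant-bit-first definition of `bt2_val` is an assumption chosen to agree with the ordering of the list above.

```python
from itertools import product

# Keys paired with placeholder names for the four phase instructions.
PHASES = [
    ((False, False), "phase_one"),
    ((False, True), "phase_two"),
    ((True, False), "phase_three"),
    ((True, True), "phase_four"),
]


def bt2_val(clk):
    """Numeric value of a boolean 2-tuple, most significant bit first (assumed)."""
    hi, lo = clk
    return 2 * int(hi) + int(lo)


def length_lemma():
    """Every key indexes some instruction: bt2_val clk < LENGTH list."""
    return all(bt2_val(clk) < len(PHASES) for clk in product([False, True], repeat=2))


def order_lemma():
    """Every key indexes the right instruction: clk = FST (EL (bt2_val clk) list)."""
    return all(PHASES[bt2_val(clk)][0] == clk for clk in product([False, True], repeat=2))
```

Both functions quantify over all four boolean pairs, which is exactly the case split a HOL proof of these lemmas would perform.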


Instantiating the Correctness Theorem. Having proven the theory obliga- 
tions, we can instantiate the generic interpreter theory. The function from the 
abstract package which does this takes several arguments. 

1. The name of the theory to instantiate. 

2. A list of the lemmas proving the theory obligations.

3. A list of substitutions for the parameters in the generic theorems. These 
substitutions take the form of a pair, where the first member of the pair 
gives the variable to specialize and the second gives the term with which to 
specialize it. 

4. A string to prepend to the names of the theorems resulting from the instan- 
tiation. This is used to prevent name clashes with the names in the generic 
theory. 

The instantiation of the generic interpreter theory for the AVM-1 microengine is 
shown in Figure 5.12. 


The Final Result. The result of the instantiation can be simplified through 
minor rewriting and beta reduction. In particular, we note that the temporal ab- 
straction function, Temp_Abs is equivalent to the identity function at this level since 
the clock for the electronic block model and the phase-level are the same. 


Temp_Abs_DEGENERATE = ⊢ Temp_Abs (λ t. T) = I


Using the last theorem and a few minor manipulations, the result of the instantiation
becomes a correctness result for the electronic block model and the phase-level:


PHASE_LEVEL_CORRECT_LEMMA =
⊢ EBM rep
   (λ t.
     (reg t, psw t, pc t, mem t, ivec t, ir t,
      mar t, mbr t, mpc t, alatch t, blatch t,
      ireq_ff t, iack_ff t, mir t, urom, clk t))
   (λ t. (ireq_e t)) ⇒
  Phase_Int rep
   (λ t.
     (reg t, psw t, pc t, mem t, ivec t, ir t,
      mar t, mbr t, mpc t, alatch t, blatch t,
      ireq_ff t, iack_ff t, mir t, urom, clk t))
   (λ t. (ireq_e t))


This result is the same theorem that we would have proven about the phase-level 
and the electronic block model if we had not used the generic interpreter theory. The 
result says that the electronic block model implies the phase-level for the concrete 
state and environment in our model. The result is a little cleaner than the proofs of 
correctness for other levels since it does not include a temporal projection function 
and there are no assumptions. 









let theorem_list =
    instantiate_abstract_theorems
        `gen_I`
        [Phase_Int_Correct_LEMMA;
         Phase_Int_LENGTH_LEMMA;
         Phase_Int_ORDER_LEMMA]
        [
         ("rep:^I_rep_ty",
          "([(F,F),phase_one rep;
             (F,T),phase_two rep;
             (T,F),phase_three rep;
             (T,T),phase_four rep],
            bt2_val,
            GetPhaseClock rep,
            I,
            I,
            EBM rep,
            GetEBMClock rep)");
         ("e':time'->*env'",
          "(\t. (ireq_e t))");
         ("s':time->*state'",
          "(\t. (reg t, psw t, pc t, mem t, ivec t,
                 ir t, mar t, mbr t, mpc t,
                 alatch t, blatch t, ireq_ff t,
                 iack_ff t, mir t, urom, clk t))");
        ]
        `PHASE`;;


Figure 5.12: Instantiating the abstract theory for the phase-level. 


5.3.3 Verifying the Micro-Level.

The verification of the micro-level is at once the most straightforward and the 
largest of the proofs presented here. In the proof of correctness for this level, we are 
showing that the phase-level specification implements the micro-level specification. 
Again, we do this by instantiating the generic interpreter proof. 

The instantiation is possible even though the implementation for this level, the 
phase-level specification, is vastly different in structure from the implementation in 
the proof we just completed. The electronic block model is a structural specification 
and the phase-level is a behavioral, interpreter-based specification. The reason that 
these two different types of specifications can be used in the instantiation for the 
implementation is that the generic interpreter theory places very few restrictions 
on the abstract operator representing the implementation. 

Table 5.22 gives the concrete functions used to instantiate the generic interpreter 
theory at this level. These functions were all defined in Section 5.2 with the excep- 
tion of bt6_val which gives a numerical value to a boolean 6-tuple. 


The Definition. The definition of the micro-level specification was given in Sec- 
tion 5.2.5. Using the function for instantiating definitions from the abstract package 
and expanding the let terms in the result we get the following theorem: 


Micro_Int =
⊢ Micro_Int rep s e =
   (∀ t.
     s(t + 1) =
       SND
        (EL (bt6_val (GetMPC (s t) (e t)))
            (micro_inst_list rep))
        (s t)
        (e t))


This theorem defines the micro-level state at time t + 1 in terms of the state at
time t using the instruction in the instruction list selected by the current value of
the microprogram counter.
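
The shape of this definition can be sketched in Python. This is only an
illustration of the select/key/EL/SND pattern, not the HOL theory: the function
names, the two-instruction list, and the (counter, accumulator) state are all
hypothetical.

```python
def interpreter_step(inst_list, key, select, state, env):
    """One step: s(t+1) = SND (EL (key (select s e)) inst_list) s e."""
    _, behaviour = inst_list[key(select(state, env))]
    return behaviour(state, env)

def run(inst_list, key, select, state, envs):
    """Return the finite state stream [s(0), s(1), ...] for the given inputs."""
    stream = [state]
    for env in envs:
        state = interpreter_step(inst_list, key, select, state, env)
        stream.append(state)
    return stream

# Toy instance (hypothetical): the state is (counter, acc) and the counter
# plays the role of the microprogram counter.
toy_list = [
    (0, lambda s, e: (1, s[1] + e)),  # instruction 0: add the input, go to 1
    (1, lambda s, e: (0, s[1] * 2)),  # instruction 1: double acc, go to 0
]

toy_stream = run(toy_list, key=lambda k: k, select=lambda s, e: s[0],
                 state=(0, 0), envs=[3, 0, 1])
```

Each entry pairs a key with a behaviour, mirroring the FST and SND components
that the generic theory uses.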


The Correctness Predicate. After instantiating the top-level specification for 
the micro-level, we instantiate an instruction correctness predicate for the micro- 
level. Each of the micro-level instructions must satisfy this predicate if we are 
to meet the theory obligations. We first apply the generic instruction correctness
definition INSTRUCTION_CORRECT to the concrete representation given in Table 5.22.

Table 5.22: The functions used to instantiate the abstract representation
of the generic interpreter theory for the micro-level.

    Operation     Instantiation
    ---------     ---------------
    inst_list     micro_inst_list
    key           bt6_val
    select        GetMPC
    substate      Phase_Substate
    subenv        I
    Impl          Phase_Int
    clock         GetPhaseClock
    begin         PhaseClockBegin


⊢def Micro_Int_Inst_Correct rep s e =
       INST_CORRECT
         (micro_inst_list rep,
          bt6_val, GetMPC,
          Phase_Substate rep, I, Phase_Int rep,
          GetPhaseClock rep, PhaseClockBegin) s e


After applying the function for instantiating definitions from the abstract package 
and some minor manipulation we get a predicate that can be used in subsequent 
proofs. 


Micro_Int_Inst_Correct =
⊢ Micro_Int_Inst_Correct rep s e p =
    Phase_Int rep s e ⇒
    (∀ t.
      (GetMPC(Phase_Substate rep(s t))(e t) = FST p) ∧
      (GetPhaseClock rep(s t)(e t) = PhaseClockBegin) ⇒
      (∃ c.
        Next
          (λ t'. GetPhaseClock rep(s t')(e t') = PhaseClockBegin)
          (t,t + c) ∧
        (SND p(Phase_Substate rep(s t))(e t) =
         Phase_Substate rep(s(t + c)))))


The instruction correctness predicate for the micro-level looks very similar to the 
instruction correctness predicate for the phase-level; only the names are different. 
This should not come as a surprise since they were generated by instantiating the 
same generic definition. 

The Theory Obligations. Just as we did at the phase-level, we must meet the
three theory obligations of the generic theory before we can instantiate it. 


The Instruction Correctness Lemma. The first theory obligation for this 
instance of the generic interpreter theory is that Micro_Int_Inst_Correct applies 
to every instruction in micro_inst_list. We do this by case analysis, first showing 
that it applies to each instruction in the instruction set, and then using those lemmas 
to show that it applies to every instruction. 

There are 64 instructions at the micro-level. In order to prove this large number 
of lemmas, we use the meta-language of HOL, ML, to automate most of the proof. 
We write an ML function that when applied to a number, returns the instruction 
correctness lemma for the instruction in micro_inst_list corresponding to that 
number. This function is mapped onto a list of numbers from 0 to 63 to create a 
list of lemmas — one for each instruction in the list. The regularity of the proof for 
the micro-level makes this possible. 

The first step is to write a function to produce the desired goal. The following 
function, when applied to a number, returns the goal for the instruction correspond- 
ing to that number. 


let MK_INST_CORRECT_GOAL n =
    let inst = term_list_el n
                 (snd(dest_eq(
                   snd(dest_forall(concl micro_inst_list))))) in
    "∀ (rep:^rep_ty) reg mem
       psw pc ivec ir mar mbr alatch blatch
       mpc clk urom mir ireq_ff iack_ff int_e.
       (∀ p. mk_psw rep
               (get_sm rep p,get_ie rep p,get_vf rep p,
                get_nf rep p,get_cf rep p,get_zf rep p) = p) ⇒
       Micro_Int_Inst_Correct rep
         (λ t. (reg t,psw t,pc t,mem t,
                ivec t,ir t,mar t,mbr t,mpc t,
                alatch t,blatch t,ireq_ff t,iack_ff t,
                mir t,micro_rom,clk t))
         (λ t. (int_e t)) ^inst";;


For example, when applied to 4, MK_INST_CORRECT_GOAL returns the goal for
the JMP_ul microinstruction:

    "∀ (rep:^rep_ty) reg mem
       psw pc ivec ir mar mbr alatch blatch
       mpc clk urom mir
       ireq_ff iack_ff int_e.
       (∀ p. mk_psw rep
               (get_sm rep p,get_ie rep p,get_vf rep p,
                get_nf rep p,get_cf rep p,get_zf rep p) = p) ⇒
       Micro_Int_Inst_Correct rep
         (λ t. (reg t,psw t,pc t,mem t,
                ivec t,ir t,mar t,mbr t,mpc t,
                alatch t,blatch t,ireq_ff t,iack_ff t,
                mir t,micro_rom,clk t))
         (λ t. (int_e t)) ((F,F,F,T,F,F),JMP_ul rep)";;


In order to establish the correctness of the micro-level, the goal contains an as- 
sumption about the abstract word representation: 


    ∀ p. mk_psw rep
           (get_sm rep p,get_ie rep p,get_vf rep p,
            get_nf rep p,get_cf rep p,get_zf rep p) = p


This assumption requires that the constructors and selectors for the program status 
word be mutually consistent. 
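
To make this consistency condition concrete, here is a small Python sketch of
one possible representation. It is purely illustrative: the report leaves the
word representation abstract, so the 6-bit packing, the flag order, and the
helper names below are assumptions.

```python
# Assumed flag order, taken from the argument order of mk_psw in the goal.
FLAG_NAMES = ("sm", "ie", "vf", "nf", "cf", "zf")

def mk_psw(flags):
    """Pack six booleans (sm, ie, vf, nf, cf, zf) into an integer word."""
    word = 0
    for flag in flags:
        word = (word << 1) | int(flag)
    return word

def get_flag(name, psw):
    """Select one flag from the packed word (stands in for get_sm, get_ie, ...)."""
    shift = len(FLAG_NAMES) - 1 - FLAG_NAMES.index(name)
    return bool((psw >> shift) & 1)

def psw_consistent(psw):
    """The assumption in the goal: rebuilding p from its selectors yields p."""
    return mk_psw(tuple(get_flag(n, psw) for n in FLAG_NAMES)) == psw

# Every 6-bit word satisfies the assumption in this representation.
all_consistent = all(psw_consistent(p) for p in range(64))
```

A word that carries information outside the six flags, such as 64 here, fails
the check, which suggests why the condition must be stated as an explicit
assumption about the representation.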

We can solve goals of this form through symbolic execution. As we said, the
regularity of the goals allows us to write a single tactic that solves all 64
microinstruction correctness goals. The complete tactic is too large to include
here; it can be found in [Win90b]. Rather than include it, we will describe the
theory behind how the tactic works.

We will establish an intermediate lemma for each instruction at the phase-level to
aid in the symbolic execution. This lemma gives relationships between the various
state variables at time t and t + 1 provided that the phase-level interpreter is valid
and the clock selects that instruction at time t. For example, for phase-one, we can
easily prove the following lemma using the definition of the phase-level interpreter
and the definition of the first instruction at the phase-level.

PHASE_ONE_LEMMA =
⊢ Phase_Int rep
    (λ t. (reg t,psw t,pc t,mem t,ivec t,ir t,mar t,
           mbr t,mpc t,alatch t,blatch t,ireq_ff t,
           iack_ff t,mir t,urom,clk t))
    (λ t. (int_e t)) ⇒
  (∀ t. (clk t = F,F) ⇒
     (reg(t + 1) = (reg t)) ∧
     (psw(t + 1) = (psw t)) ∧
     (pc(t + 1) = (pc t)) ∧
     (mem(t + 1) = (mem t)) ∧
     (ivec(t + 1) = (ivec t)) ∧
     (ir(t + 1) = (ir t)) ∧
     (mar(t + 1) = (mar t)) ∧
     (mbr(t + 1) = (mbr t)) ∧
     (mpc(t + 1) = (mpc t)) ∧
     (alatch(t + 1) = (alatch t)) ∧
     (blatch(t + 1) = (blatch t)) ∧
     (ireq_ff(t + 1) = (ireq_ff t)) ∧
     (iack_ff(t + 1) = (iack_ff t)) ∧
     (mir(t + 1) = (urom(bt6_val(mpc t)))) ∧
     (clk(t + 1) = (F,T)))


Note that the selection is based on the clock being (F,F) for phase-one. Just as
we expect from the definition of phase-one, the mir and clk are the only variables
that change. We would also establish PHASE_TWO_LEMMA, PHASE_THREE_LEMMA, and
PHASE_FOUR_LEMMA. These can all be proven using a single inference rule.

Now we turn our attention to the symbolic execution that establishes the
instruction correctness lemma. We begin by stripping the universally quantified
variables and the antecedents of the implication from the goal and rewriting it
with the definition of the instruction correctness predicate for the micro-level,
the definition of the microinstruction, and the definition of Next. This is the
result:

"(t < (t + c) ∧ (∀ t'. t < t' ∧ t' < (t + c) ⇒
                   ¬(λ t''. clk t'' = F,F)t') ∧
  (λ t'. clk t' = F,F)(t + c)) ∧
 (reg t,psw t,
  (JUMP_COND rep(reg_len rep(dest rep(ir t)))(psw t) =>
     add rep(EL(reg_len rep(srca rep(ir t)))(reg t),imm rep(ir t)) |
     pc t),
  mem t,ivec t,ir t,mar t,mbr t,F,F,F,F,F,F =
  reg(t + c),psw(t + c),pc(t + c),mem(t + c),ivec(t + c),
  ir(t + c),mar(t + c),mbr(t + c),mpc(t + c))"
    [ "∀ p. mk_psw rep
              (get_sm rep p,get_ie rep p,get_vf rep p,
               get_nf rep p,get_cf rep p,get_zf rep p) = p" ]
    [ "Phase_Int rep
         (λ t. (reg t,psw t,pc t,mem t,ivec t,ir t,mar t,mbr t,mpc t,
                alatch t,blatch t,ireq_ff t,iack_ff t,mir t,micro_rom,clk t))
         (λ t. (int_e t))" ]
    [ "mpc t = F,F,F,T,F,F" ]
    [ "clk t = F,F" ]


The assumption list holds the antecedents of the implication in PHASE_ONE_LEMMA. 
We can resolve PHASE_ONE_LEMMA with the assumptions using Modus Ponens to 
perform one step in the execution. The results are put back on the assumption list. 


    [ "reg(t + 1) = reg t" ]
    [ "psw(t + 1) = psw t" ]
    [ "pc(t + 1) = pc t" ]
    [ "mem(t + 1) = mem t" ]
    [ "ivec(t + 1) = ivec t" ]
    [ "ir(t + 1) = ir t" ]
    [ "mar(t + 1) = mar t" ]
    [ "mbr(t + 1) = mbr t" ]
    [ "mpc(t + 1) = F,F,F,T,F,F" ]
    [ "alatch(t + 1) = alatch t" ]
    [ "blatch(t + 1) = blatch t" ]
    [ "ireq_ff(t + 1) = int_e t" ]
    [ "iack_ff(t + 1) = iack_ff t" ]
    [ "mir(t + 1) = (F,(T,T),(F,F,F,F),F,F,F,(T,F,T),(F,F,F),T,F),
       (F,F,F,F,F,F,F,F,F),(F,F,F,F),(F,F,T),F,F,F,F,F,F" ]
    [ "clk(t + 1) = F,T" ]



Note that the value of urom has been expanded so that the microinstruction
register holds the actual bit string for the microinstruction currently selected by
the microprogram counter. Also note that the clock, clk, has advanced to (F,T).

We can now use PHASE_TWO_LEMMA to symbolically execute the second phase. We
resolve it with the assumption that the phase-level is valid and the value of the
clock at time t + 1 to obtain the following step-wise changes to the phase-level
state.


    [ "reg((t + 1) + 1) = reg(t + 1)" ]
    [ "psw((t + 1) + 1) = psw(t + 1)" ]
    [ "pc((t + 1) + 1) = pc(t + 1)" ]
    [ "mem((t + 1) + 1) = mem(t + 1)" ]
    [ "ivec((t + 1) + 1) = ivec(t + 1)" ]
    [ "ir((t + 1) + 1) = ir(t + 1)" ]
    [ "mar((t + 1) + 1) = mar(t + 1)" ]
    [ "mbr((t + 1) + 1) = mbr(t + 1)" ]
    [ "mpc((t + 1) + 1) = mpc(t + 1)" ]
    [ "alatch((t + 1) + 1) =
       EL(reg_len rep(srca rep(ir(t + 1))))(reg(t + 1))" ]
    [ "blatch((t + 1) + 1) = imm rep(ir(t + 1))" ]
    [ "ireq_ff((t + 1) + 1) = ireq_ff(t + 1)" ]
    [ "~iack_ff((t + 1) + 1)" ]
    [ "mir((t + 1) + 1) = (F,(T,T),(F,F,F,F),F,F,F,(T,F,T),(F,F,F),T,F),
       (F,F,F,F,F,F,F,F,F),(F,F,F,F),(F,F,T),F,F,F,F,F,F" ]
    [ "clk((t + 1) + 1) = T,F" ]


The alatch and blatch have been loaded at time (t + 1) + 1 just as we expect and 
the clock has advanced to (T,F). 

To execute the third phase, we resolve PHASE_THREE_LEMMA with the assumption
list and add the changes that occur during phase-three to the assumption list.


    [ "reg(((t + 1) + 1) + 1) = reg((t + 1) + 1)" ]
    [ "psw(((t + 1) + 1) + 1) = psw((t + 1) + 1)" ]
    [ "pc(((t + 1) + 1) + 1) = pc((t + 1) + 1)" ]
    [ "mem(((t + 1) + 1) + 1) = mem((t + 1) + 1)" ]
    [ "ivec(((t + 1) + 1) + 1) = ivec((t + 1) + 1)" ]
    [ "ir(((t + 1) + 1) + 1) = ir((t + 1) + 1)" ]
    [ "mar(((t + 1) + 1) + 1) = mar((t + 1) + 1)" ]
    [ "mbr(((t + 1) + 1) + 1) = mbr((t + 1) + 1)" ]
    [ "mpc(((t + 1) + 1) + 1) = mpc((t + 1) + 1)" ]
    [ "alatch(((t + 1) + 1) + 1) = alatch((t + 1) + 1)" ]
    [ "blatch(((t + 1) + 1) + 1) = blatch((t + 1) + 1)" ]
    [ "ireq_ff(((t + 1) + 1) + 1) = ireq_ff((t + 1) + 1)" ]
    [ "iack_ff(((t + 1) + 1) + 1) = iack_ff((t + 1) + 1)" ]
    [ "mir(((t + 1) + 1) + 1) =
       (F,(T,T),(F,F,F,F),F,F,F,(T,F,T),(F,F,F),T,F),
       (F,F,F,F,F,F,F,F,F),(F,F,F,F),(F,F,T),F,F,F,F,F,F" ]
    [ "clk(((t + 1) + 1) + 1) = T,T" ]


The only change in this phase is the new clock value. 








The fourth phase is executed in the same manner, using PHASE_FOUR_LEMMA to 
obtain the state changes during the fourth phase. 


    [ "reg((((t + 1) + 1) + 1) + 1) = reg(((t + 1) + 1) + 1)" ]
    [ "psw((((t + 1) + 1) + 1) + 1) =
       mk_psw rep
         (get_sm rep(psw(((t + 1) + 1) + 1)),
          get_ie rep(psw(((t + 1) + 1) + 1)),
          get_vf rep(psw(((t + 1) + 1) + 1)),
          get_nf rep(psw(((t + 1) + 1) + 1)),
          get_cf rep(psw(((t + 1) + 1) + 1)),
          get_zf rep(psw(((t + 1) + 1) + 1)))" ]
    [ "pc((((t + 1) + 1) + 1) + 1) =
       (JUMP_COND rep
          (reg_len rep(dest rep(ir(((t + 1) + 1) + 1))))
          (psw(((t + 1) + 1) + 1)) =>
        add rep(alatch(((t + 1) + 1) + 1),
                blatch(((t + 1) + 1) + 1)) | pc(((t + 1) + 1) + 1))" ]
    [ "mem((((t + 1) + 1) + 1) + 1) = mem(((t + 1) + 1) + 1)" ]
    [ "ivec((((t + 1) + 1) + 1) + 1) = ivec(((t + 1) + 1) + 1)" ]
    [ "ir((((t + 1) + 1) + 1) + 1) = ir(((t + 1) + 1) + 1)" ]
    [ "mar((((t + 1) + 1) + 1) + 1) = mar(((t + 1) + 1) + 1)" ]
    [ "mbr((((t + 1) + 1) + 1) + 1) = mbr(((t + 1) + 1) + 1)" ]
    [ "mpc((((t + 1) + 1) + 1) + 1) =
       MPC_UNIT (mpc(((t + 1) + 1) + 1))(opcode rep(ir(((t + 1) + 1) + 1)))
         (F,F,F,F,F,F) (F,F,T) (ireq_ff(((t + 1) + 1) + 1))
         (get_ie rep(psw(((t + 1) + 1) + 1)))
         (get_sm rep(psw(((t + 1) + 1) + 1)))" ]
    [ "alatch((((t + 1) + 1) + 1) + 1) = alatch(((t + 1) + 1) + 1)" ]
    [ "blatch((((t + 1) + 1) + 1) + 1) = blatch(((t + 1) + 1) + 1)" ]
    [ "ireq_ff((((t + 1) + 1) + 1) + 1) = ireq_ff(((t + 1) + 1) + 1)" ]
    [ "iack_ff((((t + 1) + 1) + 1) + 1) = iack_ff(((t + 1) + 1) + 1)" ]
    [ "mir((((t + 1) + 1) + 1) + 1) =
       (F,(T,T),(F,F,F,F),F,F,F,(T,F,T),(F,F,F),T,F),
       (F,F,F,F,F,F,F,F,F),(F,F,F,F),(F,F,T),F,F,F,F,F,F" ]
    [ "clk((((t + 1) + 1) + 1) + 1) = F,F" ]


In the fourth phase, the program counter is finally updated with the new value
(provided the jump condition is true). The clock returns to (F,F), signalling that
we are through.

The assumption list now contains the step-wise changes for each phase in the
phase-level instruction sequence for the JMP_ul microinstruction. We can solve the
goal by rewriting with the assumptions and a few auxiliary lemmas.
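
The role of Next and of the witness c in the goal can be sketched in Python.
The definition of next_time follows the expanded form of Next shown in the
rewritten goal; the clock stream and the bounded search are assumptions made for
illustration only.

```python
def next_time(pred, t1, t2):
    """Next pred (t1, t2): t2 is the first time after t1 at which pred holds."""
    return t1 < t2 and pred(t2) and all(not pred(t) for t in range(t1 + 1, t2))

# Assumed four-phase clock: (F,F) -> (F,T) -> (T,F) -> (T,T) -> (F,F).
PHASES = [(False, False), (False, True), (True, False), (True, True)]

def clk(t):
    """Hypothetical phase-level clock stream, starting at phase-one."""
    return PHASES[t % 4]

def at_begin(t):
    """The clock is at the beginning of its cycle."""
    return clk(t) == (False, False)

def cycle_length(t, limit=16):
    """Search for the c whose existence the correctness predicate asserts."""
    for c in range(1, limit + 1):
        if next_time(at_begin, t, t + c):
            return c
    return None
```

In this model the witness is 4 for every starting time at which the clock is
(F,F), matching the four symbolic execution steps above.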

The symbolic execution technique can be used to prove each of the instruction
correctness lemmas for the 64 microinstructions. Using the instruction correctness
lemmas, we can prove that every instruction in the microinstruction list is correct.

Micro_Int_CORRECT_LEMMA =
⊢ (∀ p. mk_psw rep
          (get_sm rep p,get_ie rep p,get_vf rep p,
           get_nf rep p,get_cf rep p,get_zf rep p) = p) ⇒
  EVERY (Micro_Int_Inst_Correct rep
           (λ t. (reg t,psw t,pc t,mem t,
                  ivec t,ir t,mar t,mbr t,mpc t,
                  alatch t,blatch t,ireq_ff t,iack_ff t,
                  mir t,micro_rom,clk t))
           (λ t. (int_e t))) (micro_inst_list rep)


The Length Lemma. The length lemma in the micro-level is similar to the 
length lemma at the phase-level. The only difference is that the keys are represented 
by boolean 6-tuples, so there are many more cases to consider. 


Micro_Int_LENGTH_LEMMA =
⊢ bt6_val mpc < (LENGTH (micro_inst_list rep))


The Order Lemma. The order lemma at the micro-level is also similar to the
order lemma at the phase-level. Again the number of cases is greater, but the proof
is straightforward.


Micro_Int_ORDER_LEMMA =
⊢ mpc = (FST (EL (bt6_val mpc) (micro_inst_list rep)))
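
The content of the length and order lemmas can be checked by brute force on a
small Python model. This is only an illustration of what the two lemmas say:
the placeholder instruction list, the string behaviours, and the MSB-first
bt6_val are assumptions, not the HOL objects.

```python
from itertools import product

def bt6_val(bits):
    """Number denoted by a boolean 6-tuple (assumed MSB first)."""
    value = 0
    for bit in bits:
        value = 2 * value + int(bit)
    return value

ALL_KEYS = list(product((False, True), repeat=6))  # all 64 possible mpc values

def length_lemma(inst_list):
    """Every key indexes inside the list: bt6_val mpc < LENGTH inst_list."""
    return all(bt6_val(k) < len(inst_list) for k in ALL_KEYS)

def order_lemma(inst_list):
    """The instruction at position bt6_val mpc carries mpc as its key."""
    return all(bt6_val(k) < len(inst_list) and inst_list[bt6_val(k)][0] == k
               for k in ALL_KEYS)

# Placeholder list: 64 entries, each keyed by its own position.
well_formed = [(k, "behaviour_%d" % bt6_val(k)) for k in ALL_KEYS]

# Swapping two entries keeps the length but breaks the ordering.
swapped = [well_formed[1], well_formed[0]] + well_formed[2:]
```

The case analysis in the HOL proof plays the role of the exhaustive loop over
ALL_KEYS here.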


Instantiating the Correctness Theorem. After we have established the
instruction correctness lemma, the length lemma, and the order lemma for the
micro-level, we are ready to instantiate the generic interpreter theory for the
micro-level. Figure 5.13 shows the instantiation. The variable rep (in the generic
theory) gets the concrete representation shown in Table 5.22, s' gets the
phase-level state stream, and e' gets the phase-level environment stream.


The Final Result. After the instantiation is complete and some minor rewriting 
and beta reduction, the correctness lemma for the micro-level becomes 

let theorem_list =
    instantiate_abstract_theorems
       `gen_I`
       [Micro_Int_CORRECT_LEMMA;
        Micro_Int_LENGTH_LEMMA;
        Micro_Int_ORDER_LEMMA]
       [
        ("rep:^I_rep_ty",
         "(micro_inst_list rep,
           bt6_val,
           GetMPC,
           Phase_Substate rep,
           I,
           Phase_Int rep,
           GetPhaseClock rep,
           PhaseClockBegin)");
        ("e':time'->*env'",
         "(λ t. int_e t)");
        ("s':time->*state'",
         "(λ t. (reg t,psw t,pc t,mem t,
                ivec t,ir t,mar t,mbr t,mpc t,
                alatch t,blatch t,ireq_ff t,iack_ff t,
                mir t,micro_rom,clk t))")
       ]
       `MICRO`;;


Figure 5.13: Instantiating the abstract theory for the micro-level. 

MICRO_LEVEL_CORRECT_LEMMA =
⊢ (∀ p. mk_psw rep
          (get_sm rep p,get_ie rep p,get_vf rep p,
           get_nf rep p,get_cf rep p,get_zf rep p) = p) ⇒
  Phase_Int rep
    (λ t. (reg t,psw t,pc t,mem t,ivec t,ir t,
           mar t,mbr t,mpc t,alatch t,blatch t,
           ireq_ff t,iack_ff t,mir t,micro_rom,clk t))
    (λ t. (int_e t)) ∧
  (∃ t. clk t = F,F) ⇒
  Micro_Int rep
    ((λ t. (reg t,psw t,pc t,mem t,ivec t,
            ir t,mar t,mbr t,mpc t)) o
     (Temp_Abs(λ t. clk t = F,F)))
    ((λ t. (int_e t)) o
     (Temp_Abs(λ t. clk t = F,F)))


The lambda expression

    (λ t. (reg t,psw t,pc t,mem t,ivec t,ir t,mar t,mbr t,mpc t))

in the above theorem models a state vector that is a function of time, that is, a
state stream. It is important to note, however, that this expression represents a
data abstraction of the phase-level state stream and thus is not a micro-level state
stream until it is composed with the temporal abstraction function

    (Temp_Abs (λ t. clk t = F,F))

which maps micro-level time onto phase-level time.

The correctness result also contains the assumption

    (∃ t. clk t = F,F)

This assumption must be met for the correctness result to be valid. That is, unless 
we can guarantee that at some time the clock will be at the beginning of its cycle, we 
cannot say that the computer will function correctly. Of course, we can guarantee 
this using a reset button. 
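
The temporal abstraction can be illustrated with a small Python sketch. The
definition of Temp_Abs is not repeated in this section, so the sketch assumes
that Temp_Abs p maps n to the time at which p holds for the (n + 1)-th time; the
clock and pc streams and the bounded search are hypothetical as well.

```python
def temp_abs(pred, limit=10_000):
    """Assumed reading of Temp_Abs: n -> the n-th (0-based) time pred holds."""
    def abstract_to_concrete(n):
        hits = 0
        for t in range(limit):
            if pred(t):
                if hits == n:
                    return t
                hits += 1
        # Mirrors the need for the assumption that the predicate holds at all.
        raise ValueError("pred does not hold often enough within the limit")
    return abstract_to_concrete

PHASES = [(False, False), (False, True), (True, False), (True, True)]

def clk(t):
    """Hypothetical clock that powers up in phase-two, so (F,F) first occurs at t = 3."""
    return PHASES[(t + 1) % 4]

def pc(t):
    """Hypothetical phase-level stream: the value changes once per clock cycle."""
    return 100 + (t + 1) // 4

micro_time = temp_abs(lambda t: clk(t) == (False, False))

def micro_pc(n):
    """The phase-level stream composed with the temporal abstraction."""
    return pc(micro_time(n))
```

If the clock never reaches (F,F), the search fails, which is the informal
counterpart of the existence assumption in the theorem.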

It is useful to compare the proof at this level with what would have happened 
had the phase-level specification not been used. We could have still proven the 
instruction correctness predicate for the microinstructions, but the form of the proof 
would have been quite different. Instead of being able to write a single tactic that
uses symbolic execution to verify the instruction correctness lemma, the irregular
types of proof done for the phase-level would have had to be done for each
of the 64 instructions at the micro-level. The proof of the phase level is the most
difficult one because of the length of the terms and its irregularity; having to repeat 
it 64 times would have made the proof intractable. The explicit specification of 
the phase-level is vital to the successful completion of a large microprocessor proof 
because it places a firewall between the structural specification of the electronic 
block model and the large instruction case explosion of the upper levels. 

We should also point out that even though the size of the microrom was fairly 
small (64 microwords), proofs containing larger microstores could be completed with 
very little extra human effort. Some effort could certainly be invested in making the
symbolic execution faster, but once the tactic to prove the instruction correctness
lemma is written, the difference between proving 64 microinstructions or 512 is
simply a matter of computer time.

For even larger microstores, the proof would have to be restructured. In the proof 
presented here, we assumed that every word in the microrom was unique and used 
the position of the instruction in the microrom as the key. By fixing a set of
microinstructions that are repeated often and using keys to identify identical
instructions,
much larger microroms could be verified. This amounts to a nanoprogramming 
level that may or may not reflect the actual structure of the machine. Only the 
instruction set would be verified at the micro-level and the microrom would not be 
used until the microprogram was needed to verify the macro-level. 


5.3.4 Verifying the Macro-Level. 

The goal of the macro-level verification is to show that the micro-level implements 
the macro-level. At this level, the micro-level specification becomes the implemen- 
tation and the macro-level interpreter is used as the abstract behavioral model. We 
want to show that under some small set of assumptions, the micro-level specification 
implies the macro-level specification. 

Table 5.23 gives the concrete functions used to instantiate the generic interpreter 
theory at this level. These functions were all defined in Section 5.2. 


The Definition. We define the macro-level in the usual manner, using the func- 
tion for instantiating abstract definitions from the abstract theory package. Using 
the concrete representation in Table 5.23 we produce the following specification of 
the macro-level interpreter. 

Table 5.23: The functions used to instantiate the abstract representation
of the generic interpreter theory for the macro-level.

    Operation     Instantiation
    ---------     ---------------
    inst_list     macro_inst_list
    key           Opc_Val
    select        Opcode
    substate      Micro_Substate
    subenv        I
    Impl          Micro_Int
    clock         GetMPC
    begin         FETCH_ADDR


Macro_Int =
⊢ Macro_Int rep s e =
    (∀ t.
      s(t + 1) =
        SND
          (EL(Opc_Val (Opcode rep(s t)(e t)))(macro_inst_list rep))
          (s t)
          (e t))


The Correctness Predicate. Just as we did at the phase-level and the micro- 
level, we instantiate the instruction correctness predicate for the macro-level. The 
instruction correctness predicate, once instantiated, says exactly what must be 
proven about the instructions at the macro-level to meet the theory obligations 
and instantiate the generic theory. 


Macro_Inst_Correct =
⊢ Macro_Inst_Correct rep s' e' p =
    Micro_Int rep s' e' ⇒
    (∀ t.
      (Opcode rep(Micro_Substate rep(s' t))(e' t) = FST p) ∧
      (GetMPC(s' t)(e' t) = F,F,F,F,F,F) ⇒
      (∃ c.
        Next (λ t'. GetMPC(s' t')(e' t') = F,F,F,F,F,F)(t,t + c) ∧
        (SND p(Micro_Substate rep(s' t))(e' t) =
         Micro_Substate rep(s'(t + c)))))


The Theory Obligations. We must satisfy the same three theory obligations 
at the macro-level as we did at the phase-level and micro-level. The instruction 
correctness lemma and the order lemma are a little more interesting at this level 
than they were at the micro-level because of our use of coproducts to represent the
keys. 


The Instruction Correctness Lemma. We wish to show that every instruc- 
tion in the macroinstruction set meets the instruction correctness lemma. The 
instruction list for the macro-level can be broken into two parts based on whether 
the key is a right or left injection to the coproduct used as the instruction key. If 
the key is a right injection, then the instruction is a pseudoinstruction. If it is a left 
injection, then the instruction is a user instruction. 
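
The coproduct keys can be pictured with a tagged pair in Python. This sketch is
an assumption-laden illustration: the report gives the key type as :bt5+one, but
the concrete behaviour of Opc_Val shown here (user opcodes map to 0 through 31
and the single pseudoinstruction maps to 32) is hypothetical.

```python
def INL(bits):
    """Left injection: a user instruction keyed by a boolean 5-tuple."""
    if len(bits) != 5:
        raise ValueError("expected a boolean 5-tuple")
    return ("INL", tuple(bits))

def INR():
    """Right injection: the single value of type one, the pseudoinstruction key."""
    return ("INR", ())

def is_pseudo(key):
    """Pseudoinstructions are exactly the right injections."""
    return key[0] == "INR"

def opc_val(key):
    """Hypothetical Opc_Val: position of the instruction in the list."""
    tag, payload = key
    if tag == "INL":
        value = 0
        for bit in payload:  # assumed most significant bit first
            value = 2 * value + int(bit)
        return value
    return 32  # assumed: the pseudoinstruction follows the 32 user instructions
```

Under this reading the macro-level instruction list would have 33 entries, and
the length lemma needs one extra case for the right injection.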

We develop a tactic that will prove the instruction correctness lemma for every 
instruction in the set. At the macro-level, however, there is only one pseudoinstruc- 
tion and so handling the pseudoinstruction as a special case makes more sense than 
developing a tactic general enough to solve both types of instructions. We will not 
deal with the proof of EINT here, but rather refer the interested reader to [Win90b].

We do note, however, that the techniques used for solving the user instructions 
are similar to the method used to verify the pseudoinstruction. 

Every user instruction at the macro-level has the same three microinstructions 
in common for the first part of its execution cycle. The FETCH, ISSUE, and DECODE 
microinstructions are always executed in that order before microinstructions specific 
to a macroinstruction are executed. Because of this, we prove the following lemma 
which gives the state at time t + 3 as a function of the state at time t. 


FID_LEMMA =
⊢ Micro_Int rep (λ t. (reg t,psw t,pc t,mem t,ivec t,
                       ir t,mar t,mbr t,mpc t))
                (λ t. (int_e t)) ⇒
  ∀ t. (int_e t ∧ get_ie rep (psw t) = F) ∧
       (mpc t = (F,F,F,F,F,F)) ⇒
       ((reg(t + 3),psw(t + 3),pc(t + 3),mem(t + 3),ivec(t + 3),
         ir(t + 3),mar(t + 3),mbr(t + 3),mpc(t + 3)) =
        (reg t,psw t,inc rep(pc t),mem t,ivec t,
         fetch rep(mem t,address rep(pc t)),pc t,
         fetch rep(mem t,address rep(pc t)),
         add_bt6 (F,SND (opcode rep
                          (fetch rep
                            (mem t,address rep(pc t))))) ^OFFSET)) ∧
       ~(mpc(t + 1) = F,F,F,F,F,F) ∧
       ~(mpc((t + 1) + 1) = F,F,F,F,F,F) ∧
       ~(mpc(((t + 1) + 1) + 1) = F,F,F,F,F,F)


Using this lemma in the proof allows the FETCH, ISSUE, DECODE sequence to be
symbolically executed in one step instead of three. Since we will do this for each of
the 32 user instructions, this results in substantial savings in time.

Using the same strategy that we used at the micro-level to prove the instruction
correctness lemma through symbolic execution, we can prove the instruction
correctness lemma for each instruction at the macro-level. For example, here is the
instruction correctness lemma for the first instruction in the list, JMP.


MAC_INST_0 =
⊢ (∀ m a. fetch rep (trans rep m,a) = fetch rep (m,a)) ∧
  (∀ m a x. store rep (trans rep m,a,x) =
            trans rep (store rep (m,a,x))) ∧
  (∀ m. int_fetch rep (int_trans rep m) = (int_fetch rep m)) ⇒
  Macro_Inst_Correct rep
    (λ t. reg t, psw t, pc t, mem t, ivec t,
          ir t, mar t, mbr t, mpc t)
    (λ t. int_e t)
    (INL(F,F,F,F,F),ABS_ENV (JMP rep))


Note that the instruction correctness lemma is predicated on three assumptions: 


    (∀ m a. fetch rep (trans rep m,a) = fetch rep (m,a))
    (∀ m a x. store rep (trans rep m,a,x) = trans rep (store rep (m,a,x)))
    (∀ m. int_fetch rep (int_trans rep m) = (int_fetch rep m))


Recall that memory is shared state. The function trans is a memory transformation 
function that represents what other devices that share memory with the CPU are 
doing to memory. Using trans we can write a specification that allows changes to 
memory besides those resulting from CPU action. 

The first assumption says that fetching something from memory when another 
device is changing it is the same as fetching the same thing from memory when no 
changes are occurring. The second assumption says that the order of memory write 
operations is not important. In effect, these statements are assumptions of non- 
interference between the CPU and other devices that use memory. This is exactly 
what we want to have happen, of course, if we are to say anything reasonable about
the reliable operation of a system built using AVM-1. The third assumption is a
similar statement about the interrupt vector which is shared by the CPU and the
interrupt controller. 

These three assumptions will appear in the final correctness result. When the 
correctness result for AVM-1 is used to verify some more abstract specification 
relating to the connection of the CPU chip with some other device that uses memory, 
these assumptions will have to be met to complete the verification. This kind of non- 
interference might be guaranteed with a hand-shaking protocol. The handshaking 
protocol would result in lemmas that would be used to discharge these assumptions. 
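
The first two non-interference assumptions can be exercised on a toy memory
model in Python. Everything in the sketch is hypothetical: memory is modeled as
a pair of a CPU-visible dictionary and a device-private counter, and trans
touches only the device part, which is one simple way such assumptions could
hold.

```python
def fetch(mem, addr):
    """Read a CPU-visible cell (unwritten cells read as 0 in this toy model)."""
    cpu, _ = mem
    return cpu.get(addr, 0)

def store(mem, addr, value):
    """Write a CPU-visible cell, returning a new memory."""
    cpu, dev = mem
    new_cpu = dict(cpu)
    new_cpu[addr] = value
    return (new_cpu, dev)

def trans(mem):
    """A well-behaved device: it changes only its own private part."""
    cpu, dev = mem
    return (cpu, dev + 1)

def bad_trans(mem):
    """A misbehaving device that overwrites CPU address 0."""
    cpu, dev = mem
    new_cpu = dict(cpu)
    new_cpu[0] = 99
    return (new_cpu, dev)

def non_interfering(tr, mem, addrs, values):
    """Check both memory assumptions for the given device behaviour tr."""
    fetch_ok = all(fetch(tr(mem), a) == fetch(mem, a) for a in addrs)
    store_ok = all(store(tr(mem), a, x) == tr(store(mem, a, x))
                   for a in addrs for x in values)
    return fetch_ok and store_ok

sample = ({0: 5, 1: 7}, 0)
```

A device that writes into the CPU-visible part violates the fetch assumption,
which is the kind of behaviour a hand-shaking protocol would have to rule out.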

Using the individual results about each instruction in the list, we can prove the 
instruction correctness lemma for the macro-level. 

Macro_Int_CORRECT_LEMMA =
⊢ (∀ m. int_fetch rep (int_trans rep m) = (int_fetch rep m)) ∧
  (∀ m a. fetch rep (trans rep m,a) = fetch rep (m,a)) ∧
  (∀ m a x. store rep (trans rep m,a,x) =
            trans rep (store rep (m,a,x))) ⇒
  EVERY (Macro_Inst_Correct rep
           (λ t. reg t, psw t, pc t, mem t, ivec t,
                 ir t, mar t, mbr t, mpc t)
           (λ t. int_e t))
        (macro_inst_list rep)


The Length Lemma. In the length lemma, the opcode variable opc has the
type :bt5+one. The representation of the keys as coproducts makes the proof of
the length lemma slightly more interesting than the proof of the length lemma for
the other levels, but not substantially more difficult.


Macro_Int_LENGTH_LEMMA =
⊢ Opc_Val opc < (LENGTH (macro_inst_list rep))


The Order Lemma. The proof of the order lemma for the macro-level is also
different from the proof of the order lemma for the other levels due to the coproduct
representation of the keys.


Macro_Int_ORDER_LEMMA =
⊢ opc = (FST (EL (Opc_Val opc) (macro_inst_list rep)))


Again, the result is not difficult to prove. 


Instantiating the Correctness Theorem. After the theory obligations for the 
macro-level have been established, we can instantiate the generic theory to provide 
a correctness result for this level. The concrete representation matches that of 
Table 5.23. The generic environment stream, e', is instantiated with the
micro-level environment stream and the generic state stream, s', is instantiated
with the micro-level state stream.


The Final Result. After the instantiation is complete, some minor rewriting and 
beta reduction lead to the final result for this level. 

let theorem_list =
    instantiate_abstract_theorems
       `gen_I`
       [Macro_Int_CORRECT_LEMMA;
        Macro_Int_LENGTH_LEMMA;
        Macro_Int_ORDER_LEMMA]
       [
        ("rep:^I_rep_ty",
         "(macro_inst_list rep,
           Opc_Val,
           Opcode rep,
           Micro_Substate rep,
           I,
           Micro_Int rep,
           GetMPC,^FETCH_ADDR)");
        ("e':time'->*env'",
         "(λ t:time. int_e t)");
        ("s':time->*state'",
         "(λ t:time. reg t, psw t, pc t, mem t, ivec t,
                     ir t, mar t, mbr t, mpc t)")
       ]
       `MACRO`;;


Figure 5.14: Instantiating the abstract theory for the macro-level.

MACRO_LEVEL_CORRECT_LEMMA =
⊢ (∀ m. int_fetch rep(int_trans rep m) = int_fetch rep m) ∧
  (∀ m a. fetch rep(trans rep m,a) = fetch rep(m,a)) ∧
  (∀ m a x. store rep(trans rep m,a,x) =
            trans rep(store rep(m,a,x))) ⇒
  Micro_Int rep
    (λ t. (reg t,psw t,pc t,mem t,
           ivec t,ir t,mar t,mbr t,mpc t))
    (λ t. (int_e t)) ∧
  (∃ t. mpc t = F,F,F,F,F,F) ⇒
  Macro_Int rep
    ((λ t. (reg t,psw t,pc t,
            trans rep (mem t),int_trans rep (ivec t))) o
     (Temp_Abs(λ t. mpc t = F,F,F,F,F,F)))
    ((λ t. (int_e t)) o
     (Temp_Abs(λ t. mpc t = F,F,F,F,F,F)))


The expression 


    (Temp_Abs(λ t. mpc t = F,F,F,F,F,F))


is the temporal abstraction function for the macro-level state stream. 


5.3.5 AVM-1 Is Correct. 

We have successfully instantiated the generic interpreter theory for each of the levels 
in our hierarchical decomposition. 

As discussed in Section 3.1, we can establish

    I_EBM ⇒ I_macro

in stages by showing

    I_EBM ⇒ I_phase ⇒ I_micro ⇒ I_macro


We will use the correctness results from each of the levels and Modus Ponens to 
prove the correctness result for the entire CPU. 

AVM_CORRECT =
⊢ let micro_abs = Temp_Abs(λ t. clk t = F,F) in
  let abs = micro_abs o
            (Temp_Abs(λ t. (mpc o micro_abs)t = F,F,F,F,F,F)) in
  ((∀ m. int_fetch rep (int_trans rep m) = int_fetch rep m) ∧
   (∀ m a. fetch rep(trans rep m,a) = fetch rep(m,a)) ∧
   (∀ m a x.
      store rep(trans rep m,a,x) =
      trans rep(store rep(m,a,x))) ⇒
   (∀ p. mk_psw rep
           (get_sm rep p,get_ie rep p,
            get_vf rep p,get_nf rep p,
            get_cf rep p,get_zf rep p) = p) ⇒
   EBM rep
     (λ t. (reg t,psw t,pc t,mem t,ivec t,ir t,
            mar t,mbr t,mpc t,alatch t,blatch t,
            ireq_ff t,iack_ff t,mir t,micro_rom,clk t))
     (λ t. (ireq_e t)) ∧
   (∃ t. clk t = F,F) ∧
   (∃ t. (mpc o micro_abs)t = F,F,F,F,F,F) ⇒
   Macro_Int rep
     ((λ t. (reg t,psw t,pc t,
             trans rep(mem t),int_trans rep(ivec t))) o abs)
     ((λ t. (ireq_e t)) o abs))


We can make several points about the final correctness result for AVM-1: 

• The function abs, which is defined as

      micro_abs o
        (Temp_Abs(λ t. (mpc o micro_abs) t = F,F,F,F,F,F))

  where

      micro_abs = Temp_Abs(λ t. clk t = F,F)

  is a temporal abstraction function that maps time at the macro-level to time
  at the electronic block model.

• The assumption that the shared state operations are non-interfering and the 
assumption that the selectors and constructors on the program status word 
are consistent both appear in the final result. The first will be satisfied when 
the CPU is used correctly in conjunction with other devices. The second 
represents a constraint on the abstract word package (the only one). 

• We must also require that there is a time when the clk and the mpc are at
the beginning of their cycles. The composition of mpc with micro_abs further
requires that this time be congruent for both variables. As we mentioned
earlier, both these assumptions can be met using a reset button.

• The definition of EBM uses a variable to represent the microrom, urom. In
the correctness theorem, EBM has been specialized to use the program specified
by micro_rom. The electronic block model only implements Macro_Int when
coupled with a correctly written microprogram.
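
The composition of temporal abstractions described in the first point can be made concrete with a small sketch. It is a Python illustration and not the HOL definition: temp_abs is assumed to return the time at which a predicate holds for the n-th time, the two-bit clock is modelled as a four-phase counter, and the microprogram counter is assumed (purely for illustration) to return to zero every three microcycles. The horizon bound stands in for the assumption that the required times exist.

```python
def temp_abs(pred, horizon=10_000):
    """Return a map n -> time at which pred holds for the n-th time (n = 0, 1, ...)."""
    def abs_fn(n):
        seen = 0
        for t in range(horizon):
            if pred(t):
                if seen == n:
                    return t
                seen += 1
        raise ValueError("predicate is not true often enough within the horizon")
    return abs_fn

def compose(f, g):
    """Function composition, written 'o' in the theorem."""
    return lambda n: f(g(n))

# Hypothetical signals on the electronic block model (EBM) time scale.
def clk(t):
    phase = t % 4                      # four clock phases per microcycle (assumed)
    return (bool(phase & 2), bool(phase & 1))

def mpc(t):
    return (t // 4) % 3                # back to 0 every three microcycles (assumed)

micro_abs = temp_abs(lambda t: clk(t) == (False, False))
macro_in_micro = temp_abs(lambda u: compose(mpc, micro_abs)(u) == 0)
abs_fn = compose(micro_abs, macro_in_micro)

print([micro_abs(n) for n in range(4)])   # EBM times at which microcycles start
print([abs_fn(n) for n in range(4)])      # EBM times at which macro instructions start
```

In this toy setting the macro-level clock ticks once for every twelve EBM time steps, which is what composing the two abstractions is meant to express.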


5.4 Observations. 


Having completed the verification of AVM-1, we have several observations: 

The verification presented in this section has said nothing about whether the 
macro-level specification is any good. The specification could be wrong — that is, 
not correctly specify the behavior that the designer had in mind. All we have 
done is show that we have a machine that implements this behavior, not that it 
is the behavior we want. We could prove properties about the instructions. For 
example, we could show that calling a subroutine and then returning from it leaves 
the program counter with the correct value. We could also come up with a method 
of executing the specification so that it could be tested. While we have not done 
either of these in this dissertation, they would be important if we were going to 
implement AVM-1. 

The verification of AVM-1 is dependent upon the high-level specifications of the 
blocks in the electronic block model. In order to build AVM-1, of course, we would 
need to decide upon implementations for these blocks and show that these imple- 
mentations satisfy the behavioral requirements imposed upon them by the high- 
level specifications in the electronic block model. Thus, the verification of AVM-1 
is independent, in a sense, of the particular implementations used for the individual 
blocks. This gives the designer the flexibility to change the implementation of the 
blocks without affecting the verification. For example, a designer might include an 
adder with no look ahead or an adder with 4-bit look ahead depending on the power 
and space budgets for the chip. 
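
The adder example can be sketched as follows. This is a Python illustration under stated assumptions, not part of the AVM-1 proof: add_spec plays the role of a block's high-level specification, and ripple_adder and lookahead_adder are two hypothetical implementations. The exhaustive check stands in for the proof obligation that each implementation satisfies the block specification.

```python
def add_spec(a, b, cin, width):
    """Behavioural specification: sum modulo 2**width, plus the carry out."""
    total = a + b + cin
    return total % (1 << width), total >> width

def ripple_adder(a, b, cin, width):
    """Implementation 1: no look-ahead, the carry ripples bit by bit."""
    result, carry = 0, cin
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result, carry

def lookahead_adder(a, b, cin, width, group=4):
    """Implementation 2: carries derived from generate/propagate signals per group."""
    result, carry = 0, cin
    for base in range(0, width, group):
        n = min(group, width - base)
        g = [((a >> (base + i)) & (b >> (base + i))) & 1 for i in range(n)]
        p = [((a >> (base + i)) ^ (b >> (base + i))) & 1 for i in range(n)]
        carries = [carry]
        for i in range(n):
            # c[i+1] = g[i] | (p[i] & c[i]); look-ahead hardware flattens this recurrence
            carries.append(g[i] | (p[i] & carries[i]))
        for i in range(n):
            result |= (p[i] ^ carries[i]) << (base + i)
        carry = carries[n]
    return result, carry

def satisfies_spec(impl, width=6):
    """Exhaustively compare an implementation with the specification."""
    return all(impl(a, b, c, width) == add_spec(a, b, c, width)
               for a in range(1 << width)
               for b in range(1 << width)
               for c in (0, 1))

print(satisfies_spec(ripple_adder), satisfies_spec(lookahead_adder))
```

Anything proved from add_spec alone remains valid whichever of the two adders is chosen.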

The proof of the instruction correctness lemma was done using a single tactic at 
the micro-level and another tactic at the phase-level. These tactics both operate 
through symbolic execution. Because of the great regularity imposed on the proofs 
of correctness by the generic interpreter theory, it should be possible to write a 
tactic which solves the instruction correctness lemma for any instantiation (provided 
that the implementation was an interpreter). This would be an important step 
since the instruction correctness lemma is the largest part of the effort involved in 
instantiating the theory. 

The verification highlights the fact that the generic interpreter theory uses the
same temporal abstraction for the environment and the state streams. This does 
not have to be so, but seems reasonable for our purposes. 


Chapter 6


Summary


6.1 Summary of Major Results. 

This paper has described a theory of generic interpreters and shown how that theory 
can aid in the verification of a microprocessor. We believe that several important 
benefits accrue from our work. 

We have provided a methodology for verifying microprocessors that changes what 
has been primarily a research activity into an engineering activity. The generic 
interpreter theory structures the specification by stating what definitions must be 
made. The generic interpreter theory also structures the proof by stating what 
lemmas must be proven about those definitions. 

We believe that the structure provided by the generic interpreter theory, coupled 
with the savings afforded by the hierarchical decomposition strategy, make the 
verification of usable microprocessors a viable engineering activity. We are currently 
in the process of conducting an experiment that will test this hypothesis. We 
have begun a project to reverify VIPER using graduate students not familiar with 
HOL or microprocessor verification. We plan to complete the verification using 
less than 6 man-months of effort. The project is about two-thirds complete. A 
preliminary report describing the specification of the hierarchy and the verification 
of the hierarchy’s two lowest levels can be found in [Aro90]. 

We have demonstrated that a hierarchical decomposition of the specification can 
lead to an order of magnitude reduction in the number of difficult cases that must be 
considered to complete a microprocessor proof. If we had verified AVM-1 using the
standard approach of directly establishing the macro-level from the electronic block
model, we would have had to prove 32 long, difficult instruction correctness lemmas. In
the verification of AVM-1 from Chapter 5, the number of these lemmas was reduced
to 4 due to the hierarchical decomposition. Machines with larger instruction sets
or fewer cycles provide the opportunity for even larger savings.

In addition to reducing the number of difficult cases in a verification, the hierar- 
chical decomposition leads to proofs of the other cases that are readily automatable. 
We have shown that at each level in the hierarchy above the electronic block model 
a single tactic suffices for verifying the resulting lemmas. This regularity can be 
easily exploited in HOL using ML. 

We have demonstrated how generic theories can be used to make the verification
of hardware easier. Certainly, the generic interpreter theory is not the only useful 
generic theory. We found that the generic proof for the interpreter was considerably 
easier than the specific proofs reported in [Win90a]. Because of this, new models 
for various architectural features can be easily developed and catalogued. 

The generic proofs show exactly what a correctness statement for a micropro- 
cessor means. Because there is no superfluous detail cluttering up the definitions 
and theorems, we are less likely to mistakenly think that we have proven that the
microprocessor adds, for example, when we look at the generic proof. The final 
result is a simple statement of the correctness of an interpreter with respect to its 
implementation. 


IMPL_I_CORRECT =

|- let s = (λ t:time. (substate rep (s' t))) and
       e = (λ t:time. (subenv rep (e' t))) and
       f = (λ t:time. (count rep (s' t) (e' t) =
                       (begin rep))) in
   let abs = (Temp_Abs f) in (
   (Impl rep s' e') ∧ (∃ t. f t) ⇒
   (INTERP rep) (s o abs) (e o abs))


The correctness theorem simply states that any true statement about the imple- 
mentation is similarly true about the abstract interpreter describing its behavior. 
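
The shape of this theorem can be illustrated on a toy machine. The following Python sketch is an assumption-laden illustration, not an instantiation of the HOL theory: the implementation state is a hypothetical triple (acc, latch, count), every instruction takes three cycles, the environment stream env is invented, and the abstract interpreter simply adds its input to acc. Sampling the implementation at the times where the cycle counter equals its begin value yields streams on which the interpreter relation holds.

```python
BEGIN = 0          # value of the cycle counter at the start of an instruction
CYCLES = 3         # hypothetical: every instruction takes three cycles

def env(t):
    """Hypothetical environment stream e'(t): one input value per time step."""
    return (7 * t + 3) % 11

def run_impl(steps, acc0=0):
    """Implementation state stream s'(0..steps); a state is (acc, latch, count)."""
    state = (acc0, 0, BEGIN)
    stream = [state]
    for t in range(steps):
        acc, latch, cnt = state
        if cnt == BEGIN:
            state = (acc, env(t), cnt + 1)        # latch the input
        elif cnt < CYCLES - 1:
            state = (acc, latch, cnt + 1)         # internal cycle
        else:
            state = (acc + latch, latch, BEGIN)   # commit and return to the start
        stream.append(state)
    return stream

def substate(s):
    """Data abstraction on the state: keep only the visible accumulator."""
    return s[0]

def interp_step(acc, e):
    """Abstract interpreter: one instruction adds the sampled input to acc."""
    return acc + e

s_impl = run_impl(30)
# Temporal abstraction: the times at which count = begin.
abs_times = [t for t, s in enumerate(s_impl) if s[2] == BEGIN]
s_abs = [substate(s_impl[t]) for t in abs_times]
e_abs = [env(t) for t in abs_times]

interp_holds = all(s_abs[n + 1] == interp_step(s_abs[n], e_abs[n])
                   for n in range(len(abs_times) - 1))
print(abs_times[:4], interp_holds)
```

The check covers only a finite prefix of one run, whereas the theorem is a statement about all streams satisfying the implementation relation.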

Generic theories are a powerful mechanism for reusing theorems. We have demon- 
strated how a generic interpreter theory can be instantiated — saving the user from 
having to reverify a number of difficult theorems. The generic theory can be thought 
of as a structured library that not only provides useful theorems, but also provides 
a framework for using those theorems. For example, temporal and data abstrac- 
tion between the interpreter and its implementation are handled entirely within the 
generic interpreter theory; the user can define the temporal and data abstractions 
without having to explicitly prove theorems about them. 

We have provided the first, to our knowledge, microprocessor specification with 
provisions for shared state. The work reported in this dissertation does not compose 
the microprocessor specification with other specifications. However, the inclusion 
of the transformation functions in the specification leads to conditions on the final
result that have very satisfying interpretations. 


(∀ m a. fetch rep (trans rep m,a) = fetch rep (m,a))

(∀ m a x. store rep (trans rep m,a,x) = trans rep (store rep (m,a,x)))


These assumptions amount to non-interference requirements between the memory 
actions of the CPU and the other devices in the system. The first assumes that a 
fetch will not be interfered with and the second assumes that a store will not be 
interfered with. These assumptions must be met when the CPU is composed with
other devices if we are to be able to say anything reasonable about the reliable 
operation of the system. The non-interference proofs are similar to critical regions 
in concurrent programming. Most likely, we would compose the devices using a 
handshaking protocol to prove that neither accesses the same location in memory 
at the same time. 
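
The two non-interference conditions can be checked mechanically on a small model. The Python sketch below is only an illustration under assumptions that are not in the source: a memory is modelled as a pair of CPU-visible cells and device-private cells, trans is a hypothetical well-behaved device action, and bad_trans is a hypothetical device that overwrites a CPU cell.

```python
from itertools import product

# A memory is a pair (cpu_cells, dev_cells); both are tuples of small integers.
def fetch(m, a):
    return m[0][a]

def store(m, a, x):
    cpu, dev = m
    return (cpu[:a] + (x,) + cpu[a + 1:], dev)

def trans(m):
    """Hypothetical well-behaved device: only touches its own cells."""
    cpu, dev = m
    return (cpu, tuple((v + 1) % 3 for v in dev))

def bad_trans(m):
    """Hypothetical interfering device: overwrites CPU cell 0."""
    cpu, dev = m
    return ((0,) + cpu[1:], dev)

VALUES = range(3)
ADDRS = range(2)
MEMORIES = [((c0, c1), (d0,)) for c0, c1, d0 in product(VALUES, repeat=3)]

def fetch_ok(tr):
    """For all m, a: fetch (tr m, a) = fetch (m, a)."""
    return all(fetch(tr(m), a) == fetch(m, a) for m in MEMORIES for a in ADDRS)

def store_ok(tr):
    """For all m, a, x: store (tr m, a, x) = tr (store (m, a, x))."""
    return all(store(tr(m), a, x) == tr(store(m, a, x))
               for m in MEMORIES for a in ADDRS for x in VALUES)

print(fetch_ok(trans), store_ok(trans))          # the well-behaved device
print(fetch_ok(bad_trans), store_ok(bad_trans))  # the interfering device
```

Both conditions hold for the first device and fail for the second, which is the sense in which they act as non-interference requirements.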


6.2 Future Work. 


The work presented here has shown how the generic interpreter theory and hier- 
archical decomposition strategy can be used in microprocessor verification. The 
success of this effort has led us to begin exploring several related areas.

The Computer Systems Verification Group at the University of California, Davis 
is designing and verifying a complete chip set including a memory management 
unit, an interrupt controller, a direct memory access controller, and a floating point 
coprocessor. Composing these and other devices will provide an important test of 
our methodology for specifying shared state. 

The specifications for the various levels in our hierarchy are all very regular due 
to the use of the generic interpreter theory. Imposing regularity in this way leads
to the possibility of writing tools that make use of this structure.

• For example, we believe that a general tactic for verifying the instruction 
correctness lemmas could be written that would reduce the amount of effort 
on the part of the human verifier to simply writing the specification. 

• Another example of a general purpose tool that would benefit from the struc- 
ture imposed by the generic interpreter theory is a tool for executing the 
specifications. Being able to execute the specifications would eliminate the 
need for separate simulators (which may or may not match the specifications) 
for code development. 

• We believe that the regular structure imposed by the theory will also prove 
useful in connecting a verification system with a CAD system or a silicon 
compiler. Being able to link high-level functional verifications to low-level 
tools for implementation and design would greatly increase our confidence in 
a device. 

The generic models presented here were used exclusively in microprocessor ver- 
ification. Of course, microprocessors are not the only hardware devices that act 
like interpreters. For example, an interpreter theory can be used to describe co- 
processors, such as the floating point unit and the memory management unit. We 
are exploring general models of these devices and how these models relate to the 
generic interpreter theory presented here. 


6.3 Conclusion.


The goal of our work has been to make the verification of usable microprocessors 
tractable. In this dissertation we have described a strategy for hierarchically de- 
composing specifications that reduces the number of difficult cases by an order of 
magnitude. We have also described a generic theory useful for specifying and ver- 
ifying microprocessors. The generic theory structures both the specification and 
the verification. This structure not only says what has to be done, but provides 
a framework for building tools to further support the verification. The combina- 
tion of hierarchical decomposition and the generic interpreter theory represents a 
substantial improvement over past methods for verifying microprocessor designs. 


Appendix A


Abstract Theories in HOL 


A theory is a set of types, definitions, constants, axioms and parent theories. Logics 
are extended by defining new theories. A generic or abstract theory (we will use the 
two terms interchangeably) is parameterized so that some of the types and constants 
defined in the theory are undefined inside the theory except for their syntax and an 
algebraic specification of their semantics. Group theory provides an example of a 
generic theory from mathematics. The multiplication operator is undefined except 
for its syntax (a binary operator on type :group) and a semantics given in terms
of the axioms of group theory. 

Generic theories are useful because they provide proofs about generic structures 
which can then be used to reason about specific instances of the structure. In groups, 
for example, after showing that addition over the integers satisfies the axioms of 
group theory, we can use the theorems from group theory to reason about addition 
on the integers. 
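
The group example can be sketched in Python. This is an illustration only and makes several assumptions: the obligations are checked by enumeration, so a finite carrier (the integers modulo 5 under addition) replaces the integers of the text, and left_cancellation stands for one theorem that follows from the group axioms.

```python
from itertools import product

def group_obligations(rep, carrier):
    """Check the group axioms for rep = (op, e, inv) over a finite carrier."""
    op, e, inv = rep
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a, b, c in product(carrier, repeat=3))
    ident = all(op(e, a) == a and op(a, e) == a for a in carrier)
    inverse = all(op(inv(a), a) == e and op(a, inv(a)) == e for a in carrier)
    return assoc and ident and inverse

def left_cancellation(rep, carrier):
    """Statement of a generic theorem: op a b = op a c implies b = c."""
    op, _, _ = rep
    return all(b == c
               for a, b, c in product(carrier, repeat=3)
               if op(a, b) == op(a, c))

# An instance: addition modulo 5 (a finite stand-in for integer addition).
z5 = (lambda a, b: (a + b) % 5, 0, lambda a: (-a) % 5)
# Not an instance: multiplication modulo 6 with a made-up 'inverse'.
m6 = (lambda a, b: (a * b) % 6, 1, lambda a: a)

print(group_obligations(z5, range(5)), left_cancellation(z5, range(5)))
print(group_obligations(m6, range(6)), left_cancellation(m6, range(6)))
```

Once the obligations hold for an instance, the generic theorem can be used for it without a new proof. For the structure that fails the obligations, nothing is guaranteed, and in this case cancellation indeed fails.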

This appendix describes the use and documents the implementation of generic 
theories in the HOL theorem prover. The abstract theory package was not designed 
to be a final implementation of generic theories in HOL, but rather is seen as an 
interim solution until the system can be modified to provide them as full-fledged 
objects. The appendix describes how to use abstract theories in HOL and briefly 
describes the implementation of abstract theories in HOL. The implementation is 
interesting because it displays the flexibility of the HOL theorem prover. 


A.1 Abstract Theories.


The key components of a generic theory are a set of abstract objects and a set of
abstract operations. This abstract representation is unspecified, that is, we don’t 
know (inside the theory) what the objects and operations mean. Their meaning is 
specified through a set of predicates that define relationships among members of the 
abstract representation. The theory describes a model. Any structure with objects 
and operations that satisfy the predicates is a homomorphism of that model. 

The theory obligations axiomatize the theory. Using the obligations as axioms 
allows us to prove theorems of interest about the abstract objects and operations. 
Our goal is to be able to instantiate the generic theory with a concrete representation
meeting the obligations. The instantiation specializes the generic theorems,
resulting in a set of theorems about the concrete representation. The concrete rep- 
resentation is an instance of the generic theory and represents a member of the class 
of abstract objects that it describes. 

A generic theory consists of three parts:

1. An abstract representation where the abstract operations and their types are 
declared. 

2. A list of theory obligations defining the relationships between members of the 
abstract representation. 

3. A collection of abstract theorems which are proven with respect to the obliga- 
tions. 

A.1.1 Using the Abstract Theory Package.

The remainder of this section describes the functions in the abstract theory package.
Before beginning an abstract theory, the ML file abstract should be loaded. This
sets up the commands in the abstract package and modifies some of the standard
HOL commands to support its operation.

One declares a new abstract theory in the same way that one declares a standard
theory, using new_theory. One is free to use any of the standard HOL commands
for manipulating a draft theory in their usual manner. For example, definitions can
be done in the usual way using new_definition.

A.1.2 Abstract Representations.

The abstract representation describes the abstract objects and operators in the
generic theory. The abstract theory package defines new_abstract_representation
for declaring the abstract representation. The function is applied to a list of pairs.
The first member of the pair is a string giving the name of the abstract operator and
the second member of the pair is the type of the operator. There is no limit on the
length of the list.

The system does not require that abstract objects be specifically declared. We 
represent abstract objects as type variables in HOL (denoted by a prepended as- 
terisk). Since HOL does not require that type variables be declared, we are free to 
use them wherever we wish. The declaration of abstract objects is implicit, being 
the set of type variables occurring in the abstract representation. 

The result of declaring a new abstract representation is a list of definitions. The 
definitions can usually be ignored. There is one exception: After declaring the 
abstract representation, you should apply the function make_inst_thms to the
resulting list of definitions if you intend to use the instantiation functions described
later. This will automatically prove a lemma about each definition for use during 
any subsequent instantiations. This is not done in the declaration of the represen- 
tation to save time when abstract theories are being drafted. 

In order to use the abstract representation, we will need to know its type. The 
abstract package provides a function for determining the type of an abstract representation.
The ML function abstract_type is applied to two strings. The first is
the name of the abstract theory defining the representation and the second is the 
name of any of the objects in the representation. 

When one defines a constant in the abstract theory, by convention, the first 
argument to the constant will be a variable with the same type as the abstract 
representation. This variable must, in turn, be the first argument to any of the
abstract constants from the abstract representation used in the definition. Later, 
during instantiation, the definition will be applied to a concrete representation and 
the instantiation functions will replace the abstract constants with the appropriate 
concrete constants in the instantiation. 


A.1.3 Theory Obligations.

The theory obligations are declared using the ML function new_theory_obligations.
The function is applied to a list of HOL terms. Each term should represent an
axiom concerning the abstract objects. These obligations will be available for use
in the draft theory. The system will automatically add them to the assumption list
when the standard HOL commands for declaring goals and proving theorems, such
as set_goal, are used. The HOL command close_theory closes the current draft
and after it has been issued, the system no longer automatically appends the theory 
obligations to the assumption list. 

One note on writing theory obligations: the representation variable and any vari- 
ables with abstract types that are to be instantiated must not be included in the 
universally quantified variables of any of the theory obligations. 

A.1.4 Instantiating Theories.

One makes use of a generic theory by instantiating it. The first step is to make the
generic theory a parent of the draft theory using the ML function new_parent.

HOL theories differentiate between definitions and theorems. In order to instan- 
tiate a theory we need to be able to instantiate both. Instantiating definitions is 
the easier of the two. By convention, the first variable in an abstract definition has 
the same type as the representation. To use this definition, the following steps are 
performed:


1. Make an auxiliary definition that uses the abstract definition and applies it to 
a concrete representation (an ordered n-tuple containing, in order, a concrete 
constant to instantiate each abstract constant in the abstract representation). 

2. Use the ML function instantiate_abstract_definition to produce an instance
of the abstract definition. This function is applied to three arguments.
The first is the name of the abstract theory where the abstract definition was
defined. The second is the name of the abstract definition. The third is the 
name of the definition from step (1). 

The result of this instantiation is a theorem that defines a concrete instance of the 
abstract definition and makes no reference to the abstract definition. 

As part of drafting an abstract theory, one normally proves theorems about the 
abstract representation using the theory obligations as axioms. In addition, the 
theorems may make use of some of the abstract definitions in the abstract theory. 
When the abstract theory is used, we instantiate the theorems in it so that the 
theory obligations are discharged and the new concrete theorems stand on their 
own. 

The ML function instantiate_abstract_theorems instantiates all of the ab- 
stract theorems in the theory. The function takes four arguments: 

1. th - the name of the abstract theory where theorems reside.

2. axiom_list - a list of theorems that satisfy the theory obligations and thereby
discharge the antecedents of the abstract theorems. 

3. tm_list - a list of term pairs that instantiate variables with concrete repre- 
sentations. The first term in the pair is the variable to instantiate and the 
second is the concrete representation. 

4. base - a name to prepend to newly created theorems. This is done to avoid 
name clashes with existing theorems. 


A.2 Implementational Considerations.


This section briefly describes the principles behind the implementation of abstract 
theories used in this report. The section is not intended to provide a full discussion 
of the implementation, but rather to describe how the facilities of HOL were used 
to reason about generic theories. The ML code that implements the abstract theory 
package is contained in Section A.4.

There are two features of HOL that allowed generic theories to be implemented
without changing the HOL system. The first is higher-order logic which is necessary 
for implementing abstract representations. The second is the meta-language ML 
which allowed the theory obligations to be declared and used in the proofs. 

The idea of using n-tuples of functions to implement abstract representations in 
HOL is due to Jeff Joyce [Joy89a]. The idea is that abstract types can be represented 
by type variables and that abstract functions can be represented as selectors on an
n-tuple. Each member of the tuple has the type of the corresponding member
in the abstract representation. Since the abstract functions are selectors on the
representation variable, we can use them in an abstract representation by applying
them to the representation (thus producing the right type) and we can instantiate
them by applying them to an n-tuple containing concrete functions.

For example, suppose we declare an abstract representation containing three func- 
tions as follows: 


new_abstract_representation
   [
    (`f`,":*t1->*t2");
    (`g`,":*t2->*t3");
    (`h`,":*t3->*t1")
   ]


The abstract package described in this report creates a representation with the type
:(*t1->*t2 # *t2->*t3 # *t3->*t1)


The package also makes the definitions:

   |-def ∀ rep. f rep = FST rep

   |-def ∀ rep. g rep = FST (SND rep)

   |-def ∀ rep. h rep = SND (SND rep)

and proves the theorems

   |- ∀ x y z. f (x,y,z) = x

   |- ∀ x y z. g (x,y,z) = y

   |- ∀ x y z. h (x,y,z) = z
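
The selector idea can be mimicked with nested pairs in Python. The sketch below is an illustration, not the package itself. It follows the listing in Section A.4, where the representation type is terminated by :one and the n-th selector is FST applied after n-1 applications of SND. The names make_rep and selector and the three concrete functions are hypothetical.

```python
ONE = ()   # stands in for the single value of the HOL type :one

def FST(p):
    return p[0]

def SND(p):
    return p[1]

def make_rep(*ops):
    """Build the nested pair (op1, (op2, (..., ONE))) from concrete operations."""
    rep = ONE
    for op in reversed(ops):
        rep = (op, rep)
    return rep

def selector(n):
    """n-th abstract operation (1-based): FST after n-1 applications of SND."""
    def select(rep):
        for _ in range(n - 1):
            rep = SND(rep)
        return FST(rep)
    return select

f, g, h = selector(1), selector(2), selector(3)

# A concrete representation: f : str -> int, g : int -> float, h : float -> str.
rep = make_rep(len, lambda n: n / 2, str)

print(h(rep)(g(rep)(f(rep)("abcd"))))
```

Applying a selector to the representation yields the concrete function, which is how instantiation replaces abstract constants with concrete ones.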

The implementation of theory obligations depends on the use of sequents as the
underlying structure for goals and theorems and the meta-language used to program 
the system. A sequent is a pair where the first member is a list of terms representing 
the assumption list and the second member is a term representing the conclusion. 
Goals are represented as sequents and transformed into theorems (which have the 
same structure) when they have a proof. 

The HOL system has three ML functions that are used for proof and goal man- 
agement. 


set_goal: goal -> void

TAC_PROOF: (goal # tactic) -> void

prove_thm: (string # goal # tactic) -> void

The first sets a goal in the proof management system for subsequent interactive 
proof. The second proves the goal using the tactic (if it can). The third solves the 
goal using the tactic and saves the resulting theorem in the draft theory using the 
name given in the string. 

The function new_theory_obligations takes a single argument, a list of terms. This
list of terms is saved in a variable. When one of the above ML functions for setting 
up a goal is called, the list of terms in the theory obligations is appended to the 
(usually null) list of terms in the assumption list of the goal. These terms appear 
on the assumption list and can be used to prove the goal. The theory obligations 
remain on the assumption list of any resulting theorem, serving as a reminder that 
the theorem cannot stand on its own. 
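
The mechanism can be mimicked in a few lines of Python. This is a toy model under loud assumptions: terms are plain strings, a sequent is a pair (assumptions, conclusion), and no proof is actually performed. The names theory_obligation_list, new_theory_obligations and set_goal echo the listing in Section A.4, while prove and discharge are hypothetical helpers introduced only for this sketch.

```python
theory_obligation_list = []

def new_theory_obligations(terms):
    """Record the obligations of the abstract theory being drafted."""
    global theory_obligation_list
    theory_obligation_list = list(terms)

def set_goal(goal):
    """Append the current obligations to the goal's assumption list."""
    assumptions, conclusion = goal
    return (list(assumptions) + theory_obligation_list, conclusion)

def prove(goal):
    """Stand-in for a successful tactic proof: the theorem keeps the assumptions."""
    return set_goal(goal)

def discharge(thm, axioms):
    """Drop each assumption that is the conclusion of an assumption-free theorem."""
    proved = {concl for hyps, concl in axioms if not hyps}
    hyps, concl = thm
    return ([h for h in hyps if h not in proved], concl)

obligations = ["!x. op e x = x", "!x. op (inv x) x = e"]
new_theory_obligations(obligations)

thm = prove(([], "!x y. (op x y = x) ==> (y = e)"))
print(thm[0])                                        # both obligations are present

concrete = discharge(thm, [([], "!x. op e x = x")])
print(concrete[0])                                   # one obligation remains
```

A theorem whose assumption list is not empty after discharging is still conditional on the remaining obligations.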


A.3 Limitations. 


There are several limitations to the abstract package that should be fixed if this 
package is not superseded by a full-fledged abstract theory implementation in the 
HOL system. 

• The entire abstract theory must be declared in one file. The primary problem 
is that there is no way for a theory to know whether or not it is an abstract 
theory. The theory obligations from an abstract parent are not available in an 
abstract child. This could be fixed by storing them in the theory and creating 
a special new_parent command for recalling them.

• The package only supports goal-directed proofs. Forward proof styles could be 
supported with a little more work. Some of the things necessary for supporting 
multiple file abstract theories mentioned in the previous item would be used
here as well. 


A.4 Implementing Abstract Theories in HOL


This appendix provides the complete source code for the implementation of abstract
theories in HOL.

%
 File:         abstract.ml

 Description:  Defines ML functions for defining generic structures.

 Author:       (c) P. J. Windley 1989

 Date:         29 DEC 89
%

let new_abstract_representation lst = (
   letrec make_type lst =
      null lst => ":one"
               |  let rest = make_type (tl lst) in
                  ":^(snd (hd lst)) # ^rest" in
   let rep_type = make_type lst in
   letrec make_definitions lst n =
      null lst => nil |
      let f = (make_definitions (tl lst) (n+1)) and
          name = (fst (hd lst)) and
          nterm = (int_to_term n) in
      letrec make_tuple_term n =
         (n=0) => "rep:^rep_type" |
         let f = make_tuple_term (n-1) in
         "SND ^f" in
      let tuple_term = "FST ^(make_tuple_term (n-1))" in
      let op_type = ":^rep_type -> ^(snd(hd lst))" in
      (name,
       "!rep:^rep_type. ^(mk_var(name,op_type)) rep =
           ^tuple_term") . f in
   map new_definition (make_definitions lst 1))
   ? failwith `new_abstract_representation`;;

let abstract_type th const = (
   hd(snd(dest_type
      (snd(dest_const (hd(filter (\x. (fst o dest_const) x = const)
                                 (constants th))))))))
   ? failwith `abstract_type`;;

let make_inst_thms th_list = (
   let is_FST_term t =
      fst(dest_const(fst(strip_comb t))) = `FST` in
   let is_SND_term t =
      fst(dest_const(fst(strip_comb t))) = `SND` in
   let FST_CONV t =
      if is_FST_term t then
         let op,pr = dest_comb t in
         let op,[t1;t2] = strip_comb pr in
         SPECL [t1;t2] (
            INST_TYPE [((type_of t1),":*");
                       ((type_of t2),":**")] FST)
      else fail in
   let SND_CONV t =
      if is_SND_term t then
         let op,[t1;t2] = strip_comb (snd (dest_comb t)) in
         SPECL [t1;t2] (
            INST_TYPE [((type_of t1),":*");
                       ((type_of t2),":**")] SND)
      else fail in
   let make_inst_thm th = (
      letrec MY_DEPTH_CONV conv t =
         (SUB_CONV (MY_DEPTH_CONV conv) THENC (TRY_CONV conv)) t in
      let rep_type =
         (snd(dest_var (rand (rand (rator (concl (SPEC_ALL th))))))) in
      letrec make_spec_term tp n = (
         if tp = ":one" then "y:one" else
         let new_types = (snd(dest_type tp)) in
         let new_term = make_spec_term (hd(tl new_types)) (n+1) in
         let term_str = concat `elm` (string_of_int n) in
         "^(mk_var (term_str, hd new_types)),^new_term")
         ? failwith `make_spec_term` in
      let spec_th = SPEC (make_spec_term rep_type 0) th in
      CONV_RULE ( (RAND_CONV (MY_DEPTH_CONV SND_CONV)) THENC
                  (RAND_CONV FST_CONV) ) spec_th) in
   let save_thm_list basename th_lst = (
      letrec process_lst th_lst n =
         if null th_lst then [] else
         let name = concat basename (string_of_int n) in
         (save_thm (name,hd th_lst)).process_lst (tl th_lst) (n+1) in
      (process_lst th_lst 0)) in
   save_thm_list (current_theory ()) (map make_inst_thm th_list))
   ? failwith `make_inst_thms`;;

let get_abstract_thms th_name =
   letrec retrieve_thms n = (
      let name = (concat th_name (string_of_int n)) in
      (theorem th_name name) . (retrieve_thms (n+1))) ? [] in
   retrieve_thms 0;;

let instantiate_abstract_definition th_name defn1 defn2 =
   let th_list = get_abstract_thms th_name in
   (ONCE_REWRITE_RULE th_list
      (ONCE_REWRITE_RULE [definition th_name defn1] defn2));;




%
 instantiate_abstract_theorems

    th         -- abstract theory where theorems reside

    axiom_list -- list of theorems that discharge antecedents in
                  abstract theorems

    tm_list    -- list of term pairs that instantiate free variables.
                  The first term in the pair is the variable to
                  instantiate and the second is the instantiation.

    base       -- name to prepend to newly created theorems
%


let instantiate_abstract_theorems th axiom_list tm_list base =
   let abs_thms = get_abstract_thms th in
   letrec add_one_at_end p = (
      let f,s = dest_pair p in
      mk_pair(f,add_one_at_end s) ? mk_pair(f,mk_pair(s,"@x:one.F"))) in
   letrec build_type_list tm_pair_list =
      if null tm_pair_list then [] else
      let (gen_tm,spec_tm) = hd(tm_pair_list) in
      let alt_spec_tm = (add_one_at_end spec_tm) ? spec_tm in
      let type_list = snd((match gen_tm spec_tm)?
                          (match gen_tm alt_spec_tm)) in
      type_list @ (build_type_list (tl tm_pair_list)) in
   let type_list = build_type_list tm_list in
   letrec GEN_FROM_LIST tm_pair_list thm = (
      if null tm_pair_list then thm else
      let (gen_tm,spec_tm) = hd (tm_pair_list) in
      let gen_thm = (GEN gen_tm thm) in
      GEN_FROM_LIST (tl tm_pair_list) gen_thm)
      ? thm in
   letrec SPEC_FROM_LIST tm_pair_list thm = (
      if null tm_pair_list then thm else
      let (gen_tm,spec_tm) = hd(tm_pair_list) in
      let alt_spec_tm = (add_one_at_end spec_tm) ? spec_tm in
      let spec_thm = ((SPEC spec_tm thm) ?
                      (SPEC alt_spec_tm thm)) in
      SPEC_FROM_LIST (tl tm_pair_list) spec_thm)
      ? thm in
   let multi_mp thm alist =
      letrec multi_mp_aux thm alist =
         if null alist then thm else
         let new_thm = PROVE_HYP (hd alist) thm in
         multi_mp_aux new_thm (tl alist) in
      DISCH_ALL (multi_mp_aux (UNDISCH_ALL thm) alist) in
   let instantiate_one_thm thm = (
      let undisch_thm = (DISCH_ALL thm) in
      let gen_thm = GEN_FROM_LIST tm_list undisch_thm in
      let inst_thm = INST_TYPE (build_type_list tm_list) gen_thm in
      let spec_thm = SPEC_FROM_LIST (rev tm_list) inst_thm in
      let new_thm =
         (PURE_REWRITE_RULE abs_thms spec_thm) in
      multi_mp new_thm axiom_list) ? thm in
   letrec generate_names th_name n =
      let name = concat th_name (string_of_int n) in
      (theorem th_name name); (name.(generate_names th_name (n+1))) ? [name] in
   let th_thms = (theorems th) in
   let real_thms = subtract th_thms
      (filter (\x:(string#thm). (mem (fst x) (generate_names th 0)))
              th_thms) in
   let new_base = concat base `_` in
   letrec make_save_list nt_list =
      if null nt_list then [] else
      let name,thm = hd(nt_list) in
      (concat new_base name, instantiate_one_thm thm) .
      (make_save_list (tl nt_list)) in
   make_save_list real_thms;;


% set up obligation lists 7 % 

letref theory_obligation_list = □ : (term)list ; ; 

let new_theory_obligations tm_list = 
theory_obligat ion_list := tm_list;; 


%
Modify the standard commands so that they know about obligation
lists.
%

% Prove and store a theorem %

let prove_thm(tok, w, tac:tactic) =
    let gl,prf = tac (([] @ theory_obligation_list), w) in
    if null gl then save_thm (tok, prf [])
    else
       (message (`Unsolved goals:`);
        map print_goal gl;
        print_newline();
        failwith (`prove_thm -- could not prove ` ^ tok));;


% TAC_PROOF (g,tac) uses tac to prove the goal g %
let TAC_PROOF : (goal # tactic) -> thm =
    set_fail_prefix `TAC_PROOF`
       (\(g,tac).
           let new_g = ((fst g) @ theory_obligation_list, snd g) in
           let gl,p = tac new_g in
           if null gl then p[]
           else (
              message (`Unsolved goals:`);
              map print_goal gl;
              print_newline();
              failwith `unsolved goals`));;


% Set the top-level goal, initialize %
let set_goal g =
    let new_g = ((fst g) @ theory_obligation_list, snd g) in
    change_state (abs_goals (new_stack new_g));;

let g = \t. set_goal([],t);;

let close_theory_orig = close_theory;;

let close_theory x =
    theory_obligation_list := [];
    close_theory_orig x;;

let new_theory_orig = new_theory;;

let new_theory x =
    theory_obligation_list := [];
    new_theory_orig x;;
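
The wrappers above share one pattern: a global obligation list is appended to the assumptions of every goal, and the list is cleared whenever a theory is opened or closed. The following Python sketch models only that pattern and is not part of the HOL sources. Terms are modeled as plain strings, and `with_obligations`, `wrap_theory_command`, and the stand-in `new_theory_orig` are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple

# A goal is (assumptions, conclusion); terms are modeled as strings (assumption).
Goal = Tuple[List[str], str]

# Plays the role of the ML reference theory_obligation_list.
theory_obligation_list: List[str] = []


def new_theory_obligations(terms: List[str]) -> None:
    """Record the obligations that every later goal should assume."""
    theory_obligation_list[:] = terms


def with_obligations(goal: Goal) -> Goal:
    """Append the current obligations to the goal's assumptions."""
    assumptions, conclusion = goal
    return (assumptions + theory_obligation_list, conclusion)


def wrap_theory_command(command: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a theory-level command so that it first clears the obligations."""
    def wrapped(name: str) -> str:
        new_theory_obligations([])
        return command(name)
    return wrapped


def new_theory_orig(name: str) -> str:
    """Hypothetical stand-in for the original new_theory command."""
    return f"theory {name} opened"


new_theory = wrap_theory_command(new_theory_orig)

new_theory_obligations(["inst_ok", "clock_ok"])
print(with_obligations((["a"], "goal")))   # obligations are added to the goal
new_theory("avm")
print(with_obligations((["a"], "goal")))   # obligations were cleared
```

Keeping the original command under a separate name and rebinding the public name to the wrapper is the same device the ML code uses with `close_theory_orig` and `new_theory_orig`.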
Appendix B


The Organization of the Proof



This appendix presents the organization of the proof of AVM-1 in HOL. The
appendix discusses the overall proof organization, describes the theories
making up the proof, and gives some measurements of the complexity of the proof.


B.1 Proof Organization

The proof for AVM-1 contains more than 25 theories. This section presents the
general proof organization (the hierarchy of theories) and briefly describes the
contents of each theory.

Figure B.1 shows how the main theories of the proof of AVM-1 are related. This
hierarchy shows avm.th as the child theory of a long ancestry that follows the
hierarchical decomposition discussed in the body of this dissertation. The picture
is not complete; there are many theories not shown. For example, aux_defs.th is
the ancestor of almost every theory in the proof.

The rest of this section gives a taxonomy of the major theories in the proof of
AVM-1.


Generic Interpreters. The generic interpreter theories include the synchronous 
model, the temporal abstraction theory, and the asynchronous model. 

• gen_I_sync.th - Defines and verifies a synchronous version of the generic
interpreter theory.

• time_abs.th - Defines a temporal abstraction function and proves several
useful lemmas concerning it.

• gen_I.th - Contains the generic definition of an interpreter used in the
definition and proof of the various levels in AVM-1.


Auxiliary Theories. There are a number of auxiliary theories that are used 
throughout the proof of AVM-1. 

Figure B.1: The theory hierarchy for the proof of AVM-1.

• aux_defs.th - Contains the abstract definition for n-bit words. The
definition is accomplished using the functions in abstract.ml, the ML code for
producing abstract theories.

• aux_thms.th - Contains auxiliary definitions and theorems. The theory is
an ancestor of many of the main theories in the proof.

• jump_def.th — Contains the definition of the jump condition logic that is 
used at every level. 

• regs_def.th — Contains the definition of the register file. Several distin- 
guished registers are defined and the function for updating the register file is 
given. 


The Electronic Block Model. The electronic block model description depends
on a number of theories. The definition makes use of a generic ALU that is
subsequently instantiated to define the ALU used in AVM-1. The shifter and
microprogram counter are also defined separately.

• mux16_def.th - Contains the definition of a 16-input multiplexor that is
used in the definition of the generic ALU theory.

• gen_alu.th - Contains the abstract definition and verification of a
16-function ALU.

• alu_def.th - Contains the instantiation of the generic ALU theory presented
in the last section for a specific set of functions. The correctness result is
meaningless since the modules used to implement the functions are null modules.
This does not affect the validity of the proof presented here since only the
definition is used in subsequent theories. A number of theorems about the
ALU's output are proven here and are used in subsequent proofs.

• shifter_def.th - Contains the definition of a 4-function shifter that is used in
defining the electronic block model. A number of theorems about the shifter's
output are proven here and are used in subsequent proofs.

• mpc_def.th - Contains the definition of the microprogram counter unit that
is used in the definition of the electronic block model and the phase-level.

• select_def.th - Contains the definition of the state selectors for the
electronic block model.

• block_def.th - Contains the definition of the electronic block model,
including the definitions of most of the blocks used to construct it.

The Phase-Level. This section presents the theories that define the phase-level 
interpreter. Also presented is the theory that verifies the phase-level interpreter 
with respect to the electronic block model. 

• ucode_aux.ml - Contains the ML code that defines the microcode assembler.
No theory is created; the assembler is an ML program that creates the
appropriate terms for a given program statement.

• ucode_def.th - Defines the type for the microcode as well as a number of
selector functions that return the various fields that make up a
microinstruction.

• phase_def.th - Defines the abstract behavior of the 4 phase-level
instructions and gives several auxiliary definitions used in instantiating the
abstract interpreter theory.

• phase.th - Contains the correctness result for the phase-level. The result is
obtained by instantiating the generic interpreter theory contained in gen_I.th.

The Micro-Level. This section presents the theories that define the micro-level
interpreter. Also presented is the theory that verifies the micro-level interpreter
with respect to the phase-level interpreter.

• micro_def.th - Defines the abstract behavior of the 64 micro-level
instructions and gives several auxiliary definitions used in instantiating the
abstract interpreter theory.

• uinst_def.th - Defines the microinstructions and combines them together
into the microrom.

• micro.th - Contains the correctness result for the micro-level. The result
is obtained by instantiating the generic theory gen_I.th.


The Macro-Level. This section presents the theories that define the macro-level
interpreter. Also presented is the theory that verifies the macro-level interpreter
with respect to the micro-level interpreter.

• macro_def.th - Defines the abstract behavior of the 32 macro-level
instructions and gives several auxiliary definitions used in instantiating the
abstract interpreter theory.

• macro.th - Contains the correctness result for the macro-level. The result
is obtained by instantiating the generic theory gen_I.th.


The Final Result. This section presents the theory that proves that AVM-1 is 
correct. The theory is the descendant of all of the theories presented earlier. 

• avm.th - Contains the correctness result for the microprocessor. The final
result is obtained by combining the correctness results from phase.th,
micro.th, and macro.th.
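
The ancestry described in this section can be written down as a dependency graph and used to derive an order in which the theories could be built. The sketch below is illustrative only: the edges are an assumption read off the theory descriptions above, not taken from Figure B.1, and auxiliary ancestors are omitted.

```python
from graphlib import TopologicalSorter

# Maps each theory to the theories it is assumed to depend on.
# Assumption: these edges are inferred from the descriptions in this
# section; the actual hierarchy in Figure B.1 contains more theories.
parents = {
    "gen_alu.th": {"mux16_def.th"},
    "alu_def.th": {"gen_alu.th"},
    "phase.th": {"gen_I.th", "phase_def.th", "block_def.th"},
    "micro.th": {"gen_I.th", "micro_def.th", "phase_def.th"},
    "macro.th": {"gen_I.th", "macro_def.th", "micro_def.th"},
    "avm.th": {"phase.th", "micro.th", "macro.th"},
}

# static_order() yields every theory after all of its parents.
load_order = list(TopologicalSorter(parents).static_order())
print(load_order)
```

Any order produced this way places gen_I.th before the three level theories and places avm.th after all of them, which matches the statement that avm.th is a descendant of the theories presented earlier.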


B.2 Proof Metrics


Table B.1 presents the run-times for the various theories in the proof on a
SPARCStation with 16 Mbytes of memory. The times are CPU seconds. The table also
gives the number of primitive inferences required to run the corresponding ML script
in HOL. We used version 1.11 of HOL, built using the Austin Kyoto Common
Lisp compiler.

The total time to run the proof was 208029.1 CPU seconds, or nearly 58 CPU 
hours. The proof took almost a week of elapsed time because the core images were 
quite large (as high as 29 Mbytes) and caused the operating system to thrash when 
garbage collecting. 
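
As a quick check of the figures quoted here, the totals can be converted directly. The snippet below is a small illustrative calculation using the two totals from Table B.1; the average inference rate it prints is a derived figure, not one reported in the original measurements.

```python
TOTAL_CPU_SECONDS = 208029.1   # total run time from Table B.1
TOTAL_INFERENCES = 5167063     # total primitive inferences from Table B.1

cpu_hours = TOTAL_CPU_SECONDS / 3600
inferences_per_second = TOTAL_INFERENCES / TOTAL_CPU_SECONDS

print(f"{cpu_hours:.1f} CPU hours")
print(f"{inferences_per_second:.1f} primitive inferences per CPU second")
```

The first value rounds to 57.8, consistent with "nearly 58 CPU hours"; the second averages out to roughly 25 primitive inferences per CPU second over the whole proof.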

There are several files in the table that were not discussed in the last section.
Due to size limitations, the files mk_mic_x1.ml and mk_mic_x2.ml were broken out
of mk_micro.ml, and mk_mac_I.ml, mk_mac_1.ml, and mk_mac_2.ml were broken out
of mk_macro.ml.

Table B.1: Script run-times on a SPARCStation with 16M of memory.

File Name         Time (CPU sec.)    Inferences
-----------------------------------------------
def_aux.ml                3070.7            88
mk_aux.ml                 1117.5         33852
def_regs.ml                 41.0            14
def_jump.ml                 50.7             4
def_macro.ml              2373.5            84
mk_time.ml                 126.8          7256
mk_I.ml                    229.9         11727
def_micro.ml              7063.6         48460
def_mpc.ml                   6.4             4
def_ucode                  115.6            50
def_phase.ml               915.2            32
def_mux16.ml               344.2         29211
mk_gen_alu.ml             8038.4        101155
def_alu.ml                2325.3         70815
def_shift.ml               129.0          2891
def_select.ml             1969.0         43903
def_block.ml              1316.0         14738
mk_phase.ml              12818.4        355161
def_uinst                  568.?           107
mk_mic_x1.ml             54846.2       1589683
mk_mic_x2.ml             51300.6       1500604
mk_micro.ml              13505.3        295744
mk_mac_I.ml                688.3          3985
mk_mac_1.ml              16774.?        389738
mk_mac_2.ml              20256.1        457606
mk_macro.ml               7247.9        200120
mk_avm.ml                  790.9         10031
-----------------------------------------------
Total                   208029.1       5167063